
Some recent progress for continuous-time reinforcement learning and regret analysis

Authors: Guo, Xin (Conference speaker)
CIRM (Publisher)


Abstract: Reinforcement learning (RL) has recently attracted substantial research interest. Much of the attention and success, however, has been confined to the discrete-time setting. Continuous-time RL, despite its natural analytical connection to stochastic control, remains largely unexplored, with limited progress. In particular, characterizing the sample efficiency of continuous-time RL algorithms via convergence rates remains a challenging open problem. In this talk, we discuss some recent advances in the convergence-rate analysis of the episodic linear-convex RL problem and report a regret bound of order $O(\sqrt{N \ln N})$ for the greedy least-squares algorithm, where $N$ is the number of episodes. The approach is probabilistic: it establishes the stability of the associated forward-backward stochastic differential equation, studies the Lipschitz stability of feedback controls, and exploits concentration properties of sub-Weibull random variables. In the special case of the linear-quadratic RL problem, the analysis reduces to the regularity and robustness of the associated Riccati equation and the sub-exponential properties of continuous-time least-squares estimators, which leads to a logarithmic regret.
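The linear-quadratic special case mentioned in the abstract lends itself to a toy illustration. The sketch below is a minimal, hypothetical Python example (not code from the talk): each episode plans greedily via the scalar Riccati equation using the current parameter estimates, simulates the Euler-discretized scalar SDE under that feedback, and re-estimates the drift parameters by least squares over all data collected so far. The exploration noise added to the control is an assumption made here for identifiability in this toy setting, not part of the greedy scheme analyzed in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# True (unknown to the learner) scalar dynamics: dX = (A*X + B*u) dt + dW
A_true, B_true = -0.5, 1.0
Q, R, G = 1.0, 1.0, 1.0            # running and terminal cost weights
T, dt = 1.0, 0.01                  # episode horizon and Euler step size
n_steps = int(T / dt)

def riccati_gain(A, B):
    """Solve the scalar Riccati ODE -P' = 2*A*P - (B*P)**2/R + Q, P(T) = G,
    backward in time, returning the time-varying feedback gain K(t) = B*P(t)/R."""
    P = G
    K = np.empty(n_steps)
    for i in reversed(range(n_steps)):
        K[i] = B * P / R
        P = P + dt * (2.0 * A * P - (B * P) ** 2 / R + Q)
    return K

A_hat, B_hat = 0.0, 0.5            # initial parameter guesses
features, targets = [], []
for episode in range(100):
    K = riccati_gain(A_hat, B_hat)       # greedy: plan with current estimates
    x = 1.0
    for i in range(n_steps):
        u = -K[i] * x + rng.normal()     # exploration noise (assumption, see text)
        dW = rng.normal(scale=np.sqrt(dt))
        dx = (A_true * x + B_true * u) * dt + dW
        features.append([x, u])
        targets.append(dx / dt)          # regress dX/dt on (x, u)
        x += dx
    # Least-squares re-estimate of (A, B) from all episodes so far
    Phi, y = np.array(features), np.array(targets)
    A_hat, B_hat = np.linalg.lstsq(Phi, y, rcond=None)[0]

print(f"A_hat = {A_hat:.2f}, B_hat = {B_hat:.2f}")  # estimates approach (-0.5, 1.0)
```

As more episodes accumulate, the least-squares estimates concentrate around the true drift parameters and the greedy Riccati feedback approaches the optimal one, which is the mechanism behind the regret bounds discussed in the talk.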

MSC codes:

    Video information

    Director: Hennenfent, Guillaume
    Language: English
    Publication date: 27/09/2022
    Recording date: 13/09/2022
    Subcollection: Research talks
    arXiv category: Optimization and Control
    Domain: Control Theory & Optimization
    Format: MP4 (.mp4) - HD
    Duration: 00:52:44
    Audience: Researchers; Graduate Students; Doctoral Students; Post-Doctoral Students
    Download: https://videos.cirm-math.fr/2022-09-13_Guo.mp4

Meeting information

Meeting name: Advances in Stochastic Control and Optimal Stopping with Applications in Economics and Finance / Avancées en contrôle stochastique et arrêt optimal avec applications à l'économie et à la finance
Meeting organizers: Buckdahn, Rainer; Ferrari, Giorgio; Grigorova, Miryana; Quenez, Marie-Claire; Riedel, Frank
Dates: 12/09/2022 - 16/09/2022
Year of the meeting: 2022
Conference URL: https://conferences.cirm-math.fr/2600.html

Citation data

DOI : 10.24350/CIRM.V.19959403
Cite this video: Guo, Xin (2022). Some recent progress for continuous-time reinforcement learning and regret analysis. CIRM. Audiovisual resource. doi:10.24350/CIRM.V.19959403
URI : http://dx.doi.org/10.24350/CIRM.V.19959403

