Some recent progress for continuous-time reinforcement learning and regret analysis

Authors : Guo, Xin (Author of the conference)
CIRM (Publisher)

Abstract : Recently, reinforcement learning (RL) has attracted substantial research interest. Much of the attention and success, however, has been for the discrete-time setting. Continuous-time RL, despite its natural analytical connection to stochastic control, remains largely unexplored, with limited progress. In particular, characterizing the sample efficiency of continuous-time RL algorithms via convergence-rate analysis remains a challenging open problem. In this talk, we will discuss some recent advances in the convergence rate analysis for the episodic linear-convex RL problem, and report a regret bound of order $O(\sqrt{N \ln N})$ for the greedy least-squares algorithm, with $N$ the number of episodes. The approach is probabilistic, involving establishing the stability of the associated forward-backward stochastic differential equation, studying the Lipschitz stability of feedback controls, and exploring the concentration properties of sub-Weibull random variables. In the special case of the linear-quadratic RL problem, the analysis reduces to the regularity and robustness of the associated Riccati equation and the sub-exponential properties of continuous-time least-squares estimators, which leads to logarithmic regret.
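
To make the linear-quadratic special case mentioned in the abstract concrete, the following is a minimal sketch (in Python) of an episodic greedy least-squares scheme for a scalar LQ problem. It is not the speaker's implementation: the dynamics coefficients, cost weights, horizon, step size, episode count, and initial parameter guess are all illustrative assumptions. Each episode acts greedily through the Riccati feedback gain computed from the current parameter estimate, then re-fits the unknown drift parameters (A, B) by least squares on all trajectory data collected so far.

# Illustrative sketch only (not the speaker's algorithm): greedy least-squares
# for a scalar linear-quadratic problem; all parameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Unknown true dynamics: dX_t = (A*X_t + B*u_t) dt + dW_t
A_true, B_true = -1.0, 1.0
Q, R, T, dt = 1.0, 1.0, 1.0, 1e-2     # running cost Q*X^2 + R*u^2, horizon T
n_steps = int(T / dt)

def riccati_gain(A, B):
    # Solve the scalar Riccati ODE dP/dt = -(2*A*P - B^2*P^2/R + Q), P(T) = 0,
    # backwards in time with an Euler step; the greedy control is u_t = -K_t*X_t.
    P = 0.0
    K = np.zeros(n_steps)
    for k in reversed(range(n_steps)):
        P += dt * (2 * A * P - B**2 * P**2 / R + Q)
        K[k] = B * P / R
    return K

def run_episode(K):
    # Euler-Maruyama simulation under feedback gain K; return regression data.
    X, xs, us, dxs = 0.5, [], [], []
    for k in range(n_steps):
        u = -K[k] * X
        dX = (A_true * X + B_true * u) * dt + np.sqrt(dt) * rng.standard_normal()
        xs.append(X); us.append(u); dxs.append(dX)
        X += dX
    return np.array(xs), np.array(us), np.array(dxs)

# Greedy least-squares loop: act greedily w.r.t. the current estimate, then
# re-estimate (A, B) from all increments observed so far.
A_hat, B_hat = 0.0, 0.5               # crude initial guess (assumption)
Z, Y = [], []                         # accumulated regressors / responses
for episode in range(20):
    K = riccati_gain(A_hat, B_hat)
    xs, us, dxs = run_episode(K)
    Z.append(np.column_stack([xs, us]) * dt)
    Y.append(dxs)
    theta, *_ = np.linalg.lstsq(np.vstack(Z), np.concatenate(Y), rcond=None)
    A_hat, B_hat = theta
    print(f"episode {episode:2d}: A_hat={A_hat:+.3f}, B_hat={B_hat:+.3f}")

The regret analysis discussed in the talk concerns the cumulative gap between the cost incurred by such a greedy scheme and the cost of the optimal control with known parameters; this sketch only illustrates the act-then-estimate structure, not the bound itself.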


    Information on the Video

    Film maker : Hennenfent, Guillaume
    Language : English
    Available date : 27/09/2022
    Conference Date : 13/09/2022
    Subseries : Research talks
    arXiv category : Optimization and Control
    Mathematical Area(s) : Control Theory & Optimization
    Format : MP4 (.mp4) - HD
    Video Time : 00:52:44
    Targeted Audience : Researchers ; Graduate Students ; Doctoral Students ; Post-Doctoral Students
    Download : https://videos.cirm-math.fr/2022-09-13_Guo.mp4

Information on the Event

Event Title : Advances in Stochastic Control and Optimal Stopping with Applications in Economics and Finance / Avancées en contrôle stochastique et arrêt optimal avec applications à l'économie et à la finance
Event Organizers : Buckdahn, Rainer ; Ferrari, Giorgio ; Grigorova, Miryana ; Quenez, Marie-Claire ; Riedel, Frank
Dates : 12/09/2022 - 16/09/2022
Event Year : 2022
Event URL : https://conferences.cirm-math.fr/2600.html

Citation Data

DOI : 10.24350/CIRM.V.19959403
Cite this video as: Guo, Xin (2022). Some recent progress for continuous-time reinforcement learning and regret analysis. CIRM. Audiovisual resource. doi:10.24350/CIRM.V.19959403
URI : http://dx.doi.org/10.24350/CIRM.V.19959403
