Recently, reinforcement learning (RL) has attracted substantial research interest. Much of the attention and success, however, has been in the discrete-time setting. Continuous-time RL, despite its natural analytical connection to stochastic control, remains largely unexplored, with limited progress. In particular, characterizing the sample efficiency of continuous-time RL algorithms through convergence rates remains a challenging open problem. In this talk, we will discuss some recent advances in the convergence rate analysis for the episodic linear-convex RL problem, and report a regret bound of order $O(\sqrt{N \ln N})$ for the greedy least-squares algorithm, where $N$ is the number of episodes. The approach is probabilistic: it involves establishing the stability of the associated forward-backward stochastic differential equation, studying the Lipschitz stability of feedback controls, and exploring the concentration properties of sub-Weibull random variables. In the special case of the linear-quadratic RL problem, the analysis reduces to the regularity and robustness of the associated Riccati equation and the sub-exponential properties of continuous-time least-squares estimators, which leads to logarithmic regret.
We study an optimal reinsurance problem under the criterion of maximizing the expected utility of terminal wealth when the loss process exhibits jump clustering features and the insurance company has restricted information about the claims arrival intensity. By solving the associated filtering problem, we reduce the original problem to a stochastic control problem under full information. Since the classical Hamilton-Jacobi-Bellman approach does not apply, due to the infinite dimensionality of the filter, we choose an alternative approach based on Backward Stochastic Differential Equations (BSDEs). Precisely, we characterize the value process and the optimal reinsurance strategy in terms of a BSDE driven by a marked point process. The talk is based on joint work with M. Brachetta, G. Callegaro and C. Sgarra (arXiv:2207.05489, 2022).
60G55 ; 60J60 ; 91G05 ; 91G10 ; 93E20
We study the superhedging prices and the associated superhedging strategies for European options in a nonlinear incomplete market model with default. The underlying market model consists of one risk-free asset and one risky asset, whose price may admit a jump at the default time. The portfolio processes follow nonlinear dynamics with a nonlinear driver $f$. By using a dynamic programming approach, we first provide a dual formulation of the seller's (superhedging) price for the European option as the supremum, over a suitable set of equivalent probability measures $Q \in \mathcal{Q}$, of the $f$-evaluation/expectation under $Q$ of the payoff. We also establish a characterization of the seller's (superhedging) price as the initial value of the minimal supersolution of a constrained backward stochastic differential equation with default. Moreover, we provide some properties of the terminal profit made by the seller, and some results related to replication and no-arbitrage issues. Our results rely on first establishing a nonlinear optional and a nonlinear predictable decomposition for processes which are $\mathcal{E}^f$-strong supermartingales under $Q$ for all $Q \in \mathcal{Q}$. Joint work with M. Grigorova and A. Sulem.
91G20 ; 60H10 ; 60H30 ; 93E20