
Documents 68Q32 3 results


Model-free control and deep learning - Bellemare, Marc (Conference speaker) | CIRM

Multi angle

In this talk I will present some recent developments in model-free reinforcement learning applied to large state spaces, with an emphasis on deep learning and its role in estimating action-value functions. The talk will cover a variety of model-free algorithms, including variations on Q-Learning, and some of the main techniques that make the approach practical. I will illustrate the usefulness of these methods with examples drawn from the Arcade Learning Environment, the popular set of Atari 2600 benchmark domains.
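
For readers unfamiliar with Q-Learning, the sketch below shows the standard one-step update in its simplest tabular form. It is an illustration only, not taken from the talk: the talk concerns deep networks as action-value estimators, whereas this example stores values in a dictionary, and the function name, state labels, and step sizes are all hypothetical.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99, terminal=False):
    """Apply one tabular Q-learning step and return the new Q(state, action).

    The target is reward + gamma * max_a' Q(next_state, a'); the bootstrap
    term is dropped when next_state is terminal.
    """
    best_next = 0.0 if terminal else max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-state, two-action example; unseen pairs default to 0.0.
q = defaultdict(float)
actions = ["left", "right"]
first = q_learning_update(q, "s0", "right", 1.0, "s1", actions, alpha=0.5, gamma=0.9)
second = q_learning_update(q, "s1", "left", 0.0, "s0", actions, alpha=0.5, gamma=0.9)
```

In a deep variant, the dictionary lookup would be replaced by a parametric function of the state, and the same target would drive a gradient step instead of a direct table write.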

68Q32 ; 91A25 ; 68T05


Multi-armed bandits and beyond - Agrawal, Shipra (Conference speaker) | CIRM

Multi angle

In this tutorial I will discuss recent advances in the theory of multi-armed bandits and reinforcement learning, in particular the upper confidence bound (UCB) and Thompson Sampling (TS) techniques for algorithm design and analysis.
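
As a rough illustration of the two techniques named in the abstract, the sketch below implements textbook arm-selection rules for Bernoulli rewards: UCB1 (empirical mean plus a sqrt(2 ln t / n) bonus) and Thompson Sampling with a uniform Beta(1, 1) prior. These are assumed standard variants, not necessarily those analysed in the tutorial, and all function names and parameters are hypothetical.

```python
import math
import random


def ucb1_select(counts, sums, t):
    """Return the arm with the largest UCB1 index at round t (1-based)."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # play every arm once before using the index
    scores = [sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
              for a in range(len(counts))]
    return max(range(len(counts)), key=scores.__getitem__)


def thompson_select(successes, failures, rng):
    """Sample each arm's mean from its Beta posterior and pick the largest."""
    samples = [rng.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_ucb1(probs, horizon, seed=0):
    """Simulate UCB1 on Bernoulli arms and return the pull count per arm."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    sums = [0.0] * len(probs)
    for t in range(1, horizon + 1):
        arm = ucb1_select(counts, sums, t)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


# Hypothetical two-armed problem with success probabilities 0.1 and 0.9.
pulls = run_ucb1([0.1, 0.9], horizon=2000)
```

Both rules balance exploration and exploitation: UCB1 does so deterministically through the confidence bonus, Thompson Sampling through posterior randomness.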

60J20 ; 68Q32 ; 68T05

In this talk I will discuss how a variant of the classical optimal transport problem, known as the Gromov-Wasserstein distance, can help in designing learning tasks over graphs, and how it allows classical signal processing and data analysis tools, such as dictionary learning or online change detection, to be transposed to learning over these structured objects. Both theoretical and practical aspects will be discussed.
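
To make the Gromov-Wasserstein idea concrete, the sketch below evaluates the usual squared-loss objective, the sum over i, j, k, l of (C1[i][k] - C2[j][l])^2 T[i][j] T[k][l], for a fixed coupling T between two small graphs. It is an assumption-laden illustration rather than the speaker's method: it only scores a given coupling and does not perform the minimisation over couplings that defines the distance, and the graphs, the permutation, and the function name are hypothetical.

```python
def gw_cost(c1, c2, coupling):
    """Squared-loss Gromov-Wasserstein objective for a fixed coupling.

    c1 and c2 are pairwise structure matrices (here, shortest-path
    distances) of the two graphs; coupling[i][j] is the mass sent from
    node i of the first graph to node j of the second.
    """
    n, m = len(c1), len(c2)
    total = 0.0
    for i in range(n):
        for j in range(m):
            if coupling[i][j] == 0.0:
                continue
            for k in range(n):
                for l in range(m):
                    diff = c1[i][k] - c2[j][l]
                    total += diff * diff * coupling[i][j] * coupling[k][l]
    return total


# Shortest-path distances on a 3-node path graph 0 - 1 - 2.
c1 = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

# Second graph: the same path with nodes relabelled by a permutation.
perm = [2, 0, 1]
c2 = [[0] * 3 for _ in range(3)]
for i in range(3):
    for k in range(3):
        c2[perm[i]][perm[k]] = c1[i][k]

# Coupling that follows the relabelling, versus the uniform product coupling.
matched = [[1 / 3 if perm[i] == j else 0.0 for j in range(3)] for i in range(3)]
uniform = [[1 / 9] * 3 for _ in range(3)]

cost_matched = gw_cost(c1, c2, matched)
cost_uniform = gw_cost(c1, c2, uniform)
```

Because the objective compares pairwise distances within each graph rather than node positions, the coupling that follows the relabelling incurs zero cost even though the two graphs share no common node indexing.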

68Q32 ; 68T05
