Auteurs : Pagès, Gilles (Auteur de la Conférence)
CIRM (Editeur )
Résumé :
Optimal vector quantization has been originally introduced in Signal processing as a discretization method of random signals, leading to an optimal trade-off between the speed of transmission and the quality of the transmitted signal. In machine learning, similar methods applied to a dataset are the historical core of unsupervised classification methods known as “clustering”. In both case it appears as an optimal way to produce a set of weighted prototypes (or codebook) which makes up a kind of skeleton of a dataset, a signal and more generally, from a mathematical point of view, of a probability distribution.
Quantization has encountered in recent years a renewed interest in various application fields like automatic classification, learning algorithms, optimal stopping and stochastic control, Backward SDEs and more generally numerical probability. In all these various applications, practical implementation of such clustering/quantization methods more or less rely on two procedures (and their countless variants): the Competitive Learning Vector Quantization $(CLV Q)$ which appears as a stochastic gradient descent derived from the so-called distortion potential and the (randomized) Lloyd's procedure (also known as k- means algorithm, nu ees dynamiques) which is but a fixed point search procedure. Batch version of those procedures can also be implemented when dealing with a dataset (or more generally a discrete distribution).
In a more formal form, if is probability distribution on an Euclidean space $\mathbb{R}^d$, the optimal quantization problem at level $N$ boils down to exhibiting an $N$-tuple $(x_{1}^{*}, . . . , x_{N}^{*})$, solution to
argmin$_{(x1,\dotsb,x_N)\epsilon(\mathbb{R}^d)^N} \int_{\mathbb{R}^d 1\le i\le N} \min |x_i-\xi|^2 \mu(d\xi)$
and its distribution i.e. the weights $(\mu(C(x_{i}^{*}))_{1\le i\le N}$ where $(C(x_{i}^{*})$ is a (Borel) partition of $\mathbb{R}^d$ satisfying
$C(x_{i}^{*})\subset \lbrace\xi\epsilon\mathbb{R}^d :|x_{i}^{*} -\xi|\le_{1\le j\le N} \min |x_{j}^{*}-\xi|\rbrace$.
To produce an unsupervised classification (or clustering) of a (large) dataset $(\xi_k)_{1\le k\le n}$, one considers its empirical measure
$\mu=\frac{1}{n}\sum_{k=1}^{n}\delta_{\xi k}$
whereas in numerical probability $\mu = \mathcal{L}(X)$ where $X$ is an $\mathbb{R}^d$-valued simulatable random vector. In both situations, $CLV Q$ and Lloyd's procedures rely on massive sampling of the distribution $\mu$.
As for clustering, the classification into $N$ clusters is produced by the partition of the dataset induced by the Voronoi cells $C(x_{i}^{*}), i = 1, \dotsb, N$ of the optimal quantizer.
In this second case, which is of interest for solving non linear problems like Optimal stopping problems (variational inequalities in terms of PDEs) or Stochastic control problems (HJB equations) in medium dimensions, the idea is to produce a quantization tree optimally fitting the dynamics of (a time discretization) of the underlying structure process.
We will explore (briefly) this vast panorama with a focus on the algorithmic aspects where few theoretical results coexist with many heuristics in a burgeoning literature. We will present few simulations in two dimensions.
Codes MSC :
62L20
- Stochastic approximation
65C05
- Monte Carlo methods
93E25
- Computational methods in stochastic control
94A12
- Signal theory (characterization, reconstruction, filtering, etc.)
91G60
- Numerical methods in mathematical finance
Ressources complémentaires :
http://smai.emath.fr/cemracs/cemracs17/Slides/pages.pdf
|
Informations sur la Rencontre
Nom de la rencontre : CEMRACS - Summer school: Numerical methods for stochastic models: control, uncertainty quantification, mean-field / CEMRACS - École d'été : Méthodes numériques pour équations stochastiques : contrôle, incertitude, champ moyen Organisateurs de la rencontre : Bouchard, Bruno ; Chassagneux, Jean-François ; Delarue, François ; Gobet, Emmanuel ; Lelong, Jérôme Dates : 17/07/17 - 25/08/17
Année de la rencontre : 2017
URL Congrès : http://conferences.cirm-math.fr/1556.html
DOI : 10.24350/CIRM.V.19199603
Citer cette vidéo:
Pagès, Gilles (2017). Optimal vector quantization: from signal processing to clustering and numerical probability. CIRM. Audiovisual resource. doi:10.24350/CIRM.V.19199603
URI : http://dx.doi.org/10.24350/CIRM.V.19199603
|
Voir aussi
-
[Multi angle]
Forward and backward simulation of Euler scheme
/ Auteur de la Conférence Gobet, Emmanuel.
-
[Multi angle]
Stochastic variational inequalities for random mechanics
/ Auteur de la Conférence Mertz, Laurent.
-
[Multi angle]
Mean field type control with congestion
/ Auteur de la Conférence Laurière, Mathieu.
-
[Multi angle]
On the discretization of some nonlinear Fokker-Planck-Kolmogorov equations and applications
/ Auteur de la Conférence Silva Álvarez, Francisco José.
-
[Multi angle]
Project evaluation under uncertainty
/ Auteur de la Conférence Zubelli, Jorge P..
-
[Multi angle]
Some asymptotic results about American options and volativity
/ Auteur de la Conférence De Marco, Stefano.
-
[Multi angle]
Projected particle methods for solving McKean Vlasov SDEs
/ Auteur de la Conférence Belomestny, Denis.
-
[Multi angle]
Splitting algorithm for nested events
/ Auteur de la Conférence Goudenège, Ludovic.
-
[Multi angle]
Lecture 2: Introduction to HPC - MPI: design of parallel program and MPI
/ Auteur de la Conférence Lelong, Jérôme.
-
[Multi angle]
Lecture 1: Introduction to HPC, random generation, and OpenMP
/ Auteur de la Conférence Lelong, Jérôme.
-
[ Post-edited]
Numerical methods for mean field games - Lecture 2: Monotone finite difference schemes
/ Auteur de la Conférence Achdou, Yves.
-
[Multi angle]
Numerical methods for mean field games - Lecture 3: Variational MFG and related algorithms for solving the discrete system of nonlinear equations
/ Auteur de la Conférence Achdou, Yves.
-
[Multi angle]
Numerical methods for mean field games - Lecture 1: Introduction to the system of PDEs and its interpretation. Uniqueness of classical solutions
/ Auteur de la Conférence Achdou, Yves.
-
[Multi angle]
Metamodels for uncertainty quantification and reliability analysis
/ Auteur de la Conférence Marelli, Stefano.
-
[Multi angle]
Global sensitivity analysis in stochastic systems
/ Auteur de la Conférence Le Maître, Olivier.
-
[Multi angle]
Subsurface flow with uncertainty : applications and numerical analysis issues
/ Auteur de la Conférence Charrier, Julia.
-
[Multi angle]
Dynamic formulations of optimal transportation and variational MFGs
/ Auteur de la Conférence Benamou, Jean-David.
-
[Multi angle]
Least squares regression Monte Carlo for approximating BSDES and semilinear PDES
/ Auteur de la Conférence Turkedjiev, Plamen.
-
[Multi angle]
Bandits in auctions (& more)
/ Auteur de la Conférence Perchet, Vianney.
-
[Multi angle]
Multilevel and multi-index sampling methods with applications - Lecture 2: Multilevel and Multi-index Monte Carlo methods for the McKean-Vlasov equation
/ Auteur de la Conférence Tempone, Raul.
-
[Multi angle]
Multilevel and multi-index sampling methods with applications - Lecture 1: Adaptive strategies for Multilevel Monte Carlo
/ Auteur de la Conférence Tempone, Raul.
-
[Multi angle]
Model-free control and deep learning
/ Auteur de la Conférence Bellemare, Marc.
-
[Multi angle]
The Metropolis Hastings algorithm: introduction and optimal scaling of the transient phase
/ Auteur de la Conférence Jourdain, Benjamin.
-
[Multi angle]
Branching for PDEs
/ Auteur de la Conférence Warin, Xavier.
-
[Multi angle]
On the interplay between kinetic theory and game theory
/ Auteur de la Conférence Degond, Pierre.
-
[Multi angle]
Particle algorithm for McKean SDE: a short review on numerical analysis
/ Auteur de la Conférence Bossy, Mireille.
-
[Multi angle]
Cubature methods and applications
/ Auteur de la Conférence Crisan, Dan.
-
[Multi angle]
Capacity expansion games with application to competition in power generation investments
/ Auteur de la Conférence Aïd, René.
-
[Multi angle]
An introduction to BSDE
/ Auteur de la Conférence Imkeller, Peter.
-
[Multi angle]
Mean field games with major and minor players
/ Auteur de la Conférence Carmona, René.
Bibliographie
- Duflo, M. (1996). Algorithmes stochastiques. Paris: Springer-Verlag - http://www.springer.com/fr/book/9783540606994
- Gersho, A., & Gray, R.M. (1992). Vector Quantization and Signal Compression. Boston: Kluwer Academic Publishers - http://dx.doi.org/10.1007/978-1-4615-3626-0
- Graf, S., & Luschgy, H. (2000). Foundations of quantization for probability distributions. Berlin: Springer - http://dx.doi.org/10.1007/BFb0103945
- Kushner, H., & Yin, G.G. (2003). Stochastic approximation and recursive algorithms and applications. 2nd ed. New York: Springer - http://dx.doi.org/10.1007/b97441
- Pagès, G. (2015). Introduction to vector quantization and its applications for numerics. ESAIM Proceedings and Surveys, 48, 29-79 - http://dx.doi.org/10.1051/proc/201448002
- Pagès, G., & Printems, J. (2009). Optimal quantization for finance: from random vectors to stochastic processes. In A. Bensoussan, & Q. Zhang (Eds.), Handbook of numerical analysis. XV, Special volume, Mathematical modeling and numerical methods in finance (pp. 595-649). Amsterdam: Elsevier/North-Holland - http://dx.doi.org/10.1016/S1570-8659(08)00015-x
- Pagès, G., & Wilbertz, B. (2011). Optimal Delaunay and Voronoi quantization schemes for pricing American style options. - https://hal.archives-ouvertes.fr/hal-00572709