
Documents 68T07: 8 results

Recently, a lot of progress has been made regarding the theoretical understanding of machine learning methods, in particular deep learning. One particularly promising direction is the statistical approach, which interprets machine learning as a collection of statistical methods and builds on existing techniques in mathematical statistics to derive theoretical error bounds and to understand phenomena such as overparametrization. The lecture series surveys this field and describes future challenges.

68T07 ; 65Mxx

Recently, a lot of progress has been made regarding the theoretical understanding of machine learning methods, in particular deep learning. One particularly promising direction is the statistical approach, which interprets machine learning as a collection of statistical methods and builds on existing techniques in mathematical statistics to derive theoretical error bounds and to understand phenomena such as overparametrization. The lecture series surveys this field and describes future challenges.

68T07 ; 65Mxx


Biased Monte Carlo sampling in RBMs - Seoane, Beatriz (conference speaker) | CIRM


RBMs are generative models capable of fitting the probability distributions of complex datasets. Thanks to their simple structure, they are particularly well suited for interpretability and pattern extraction, a feature that is especially appealing for scientific use. In this talk, we show that RBMs operate in two distinct regimes, depending on the procedure followed to estimate the log-likelihood gradient during training. Short sampling times yield machines that are trained to reproduce exactly the dynamics followed to train them, whereas long samplings (as compared to the MCMC mixing time) are needed to learn a good model of the data. The non-equilibrium regime can be used to generate high-quality samples with short learning and sampling times, but it cannot be used to extract the unnormalized probability of the data, which is necessary for interpretability. In practice, it is hard to obtain good equilibrium models for structured datasets (the typical case in biological applications) due to a divergence of the Monte Carlo mixing times. In this work, we show that this barrier can be surmounted using biased Monte Carlo methods.
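
For readers unfamiliar with the two regimes mentioned above, here is a minimal sketch (not taken from the talk; all names and shapes are illustrative) of how the log-likelihood gradient of a binary RBM is estimated with k steps of Gibbs sampling. The chain length k is the knob that separates the regimes: small k gives the short-sampling, out-of-equilibrium estimate, while k well beyond the mixing time approaches the equilibrium one.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd_gradient(v_data, W, b, c, k=1):
        # Positive phase: hidden probabilities driven by the data.
        ph_data = sigmoid(v_data @ W + c)
        # Negative phase: k steps of alternating Gibbs sampling started at the data.
        v = v_data.copy()
        for _ in range(k):
            ph = sigmoid(v @ W + c)
            h = (rng.random(ph.shape) < ph).astype(float)
            pv = sigmoid(h @ W.T + b)
            v = (rng.random(pv.shape) < pv).astype(float)
        ph_model = sigmoid(v @ W + c)
        n = v_data.shape[0]
        # <v h>_data - <v h>_model, estimated from the batch and the sampled chain.
        dW = (v_data.T @ ph_data - v.T @ ph_model) / n
        db = (v_data - v).mean(axis=0)
        dc = (ph_data - ph_model).mean(axis=0)
        return dW, db, dc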

68T07 ; 82C44 ; 65C05


A mathematical introduction to deep learning - Xu, Jinchao (conference speaker) | CIRM


I will give an elementary introduction to basic deep learning models and training algorithms from a mathematical viewpoint. In particular, I will relate some basic deep learning models to finite element and multigrid methods. I will also touch on some advanced topics to demonstrate the potential of new mathematical insight and analysis for improving the efficiency of deep learning technologies and, in particular, for their application to the numerical solution of partial differential equations.
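
As a small illustration of the finite element connection mentioned in the abstract (an editorial sketch, not material from the talk): a one-dimensional piecewise-linear "hat" basis function with nodes a < m < b can be written exactly as a one-hidden-layer ReLU network with three neurons.

    import numpy as np

    def relu(x):
        return np.maximum(x, 0.0)

    def hat(x, a=0.0, m=0.5, b=1.0):
        # P1 finite element basis function: 0 at a and b, 1 at m, linear in between,
        # expressed as a linear combination of three ReLU neurons.
        return (relu(x - a) / (m - a)
                - relu(x - m) * (b - a) / ((m - a) * (b - m))
                + relu(x - b) / (b - m))

    x = np.linspace(0.0, 1.0, 5)
    print(hat(x))  # [0.  0.5 1.  0.5 0. ]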

68T07 ; 65L60

Since 2012, deep neural networks have led to outstanding results in a wide variety of applications, clearly outperforming previously existing methods in text, images, sound, video, graphs, and more. They consist of a cascade of parametrized linear and non-linear operators whose parameters are optimized to achieve a fixed task. This talk addresses four aspects of deep learning through the lens of signal processing. First, we explain image classification in the context of supervised learning. Then, we show several empirical results that give some insight into the black box of neural networks. Third, we explain how neural networks create invariant representations: in the specific case of translation, it is possible to design predefined neural networks that are stable to translation, namely the Scattering Transform. Finally, we discuss several recent statistical learning results about the generalization and approximation properties of this deep machinery.
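
To make the translation-stability idea concrete, here is a schematic one-dimensional sketch in the spirit of the Scattering Transform (an editorial illustration; the filters below are crude atoms, not a proper wavelet bank): fixed filters, a modulus non-linearity and wide averaging produce a representation that changes little when the input is shifted.

    import numpy as np

    def lowpass_avg(u, width):
        # Wide averaging window: the source of (approximate) translation invariance.
        return np.convolve(u, np.ones(width) / width, mode="same")

    def scattering_like(x, filters, width=64):
        # First-order scattering-style coefficients: filter, modulus, average.
        return np.stack([lowpass_avg(np.abs(np.convolve(x, psi, mode="same")), width)
                         for psi in filters])

    rng = np.random.default_rng(0)
    # Crude band-pass atoms standing in for wavelets.
    filters = [np.cos(2 * np.pi * f * np.arange(16) / 16) * np.hanning(16) for f in (1, 2, 4)]
    x = rng.standard_normal(256)
    S_x = scattering_like(x, filters)
    S_shifted = scattering_like(np.roll(x, 5), filters)
    # Small relative change: the representation is stable to a 5-sample shift.
    print(np.linalg.norm(S_x - S_shifted) / np.linalg.norm(S_x))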

68T07 ; 94A12

Since 2012, deep neural networks have led to outstanding results in a wide variety of applications, clearly outperforming previously existing methods in text, images, sound, video, graphs, and more. They consist of a cascade of parametrized linear and non-linear operators whose parameters are optimized to achieve a fixed task. This talk addresses four aspects of deep learning through the lens of signal processing. First, we explain image classification in the context of supervised learning. Then, we show several empirical results that give some insight into the black box of neural networks. Third, we explain how neural networks create invariant representations: in the specific case of translation, it is possible to design predefined neural networks that are stable to translation, namely the Scattering Transform. Finally, we discuss several recent statistical learning results about the generalization and approximation properties of this deep machinery.

68T07 ; 94A12

Finding the optimal reparametrization in shape analysis of curves or surfaces is a computationally demanding task. The problem can be phrased as an optimisation problem on the infinite-dimensional group of orientation-preserving diffeomorphisms $\mathrm{Diff}^+(\Omega)$, where $\Omega$ is the domain on which the curves or surfaces are defined.
We consider the composition of a finite number of elementary diffeomorphisms
\begin{equation}
\label{elem_diff}
\varphi_{\ell}:=\mathrm{id}+\sum_{j=1}^M \lambda_j^{\ell} f_j,\qquad \ell=1,\dots , L,\qquad \varphi\approx \varphi_L\circ \cdots \circ\varphi_1,
\end{equation}
where $\{f_j\}_{j=1}^{\infty}$ is an orthonormal basis of $T_{\mathrm{id}}\mathrm{Diff}^+(\Omega)$, and we optimise simultaneously over all the parameters $\{\lambda_j^{\ell}\}$ for $j=1,\dots ,M$ and $\ell=1,\dots, L$. The resulting algorithm is similar to a deep neural network and its implementation can be carried out using PyTorch. Properties and analysis of the method will be discussed, as well as numerical results.
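
Since the abstract mentions a PyTorch implementation, here is a hedged sketch of the construction on $\Omega = [0,1]$ with the illustrative basis choice $f_j(x) = \sin(j\pi x)$ (vanishing at the boundary); the module names, the target reparametrization and the optimizer are assumptions for illustration, not the authors' code.

    import math
    import torch

    class ElementaryDiffeo(torch.nn.Module):
        # One factor phi_l = id + sum_j lambda_j^l f_j with f_j(x) = sin(j*pi*x);
        # keeping the lambdas small keeps each factor orientation-preserving.
        def __init__(self, M):
            super().__init__()
            self.lam = torch.nn.Parameter(torch.zeros(M))
            self.register_buffer("j", torch.arange(1, M + 1, dtype=torch.float32))

        def forward(self, x):
            basis = torch.sin(math.pi * self.j * x.unsqueeze(-1))  # shape (n, M)
            return x + basis @ self.lam

    M, L = 10, 5
    phi = torch.nn.Sequential(*[ElementaryDiffeo(M) for _ in range(L)])  # phi_L o ... o phi_1
    x = torch.linspace(0.0, 1.0, 200)

    # Illustrative objective: match a known reparametrization psi(x) = x^2.
    # (The real application minimizes a distance between reparametrized shapes.)
    target = x ** 2
    opt = torch.optim.Adam(phi.parameters(), lr=1e-2)
    for _ in range(500):
        opt.zero_grad()
        loss = torch.mean((phi(x) - target) ** 2)
        loss.backward()
        opt.step()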

68T07 ; 55Q07

Learning neural networks using only a small amount of data is an important ongoing research topic with tremendous potential for applications. We introduce a regularizer for the variational modeling of inverse problems in imaging based on normalizing flows, called patchNR. It involves a normalizing flow learned on patches of very few images. The subsequent reconstruction method is completely unsupervised, and the same regularizer can be used for different forward operators acting on the same class of images.
By investigating the distribution of patches versus that of the whole image class, we prove that our variational model is indeed a MAP approach. Numerical examples for low-dose CT, limited-angle CT and super-resolution of material images demonstrate that our method provides high-quality results among unsupervised methods, while requiring only very little data. Further, the approach also works if only the low-resolution image is available.
In the second part of the talk, I will generalize normalizing flows to stochastic normalizing flows to improve their expressivity. Normalizing flows, diffusion normalizing flows and variational autoencoders are powerful generative models, and a unified framework to handle these approaches appears to be Markov chains. We consider stochastic normalizing flows as a pair of Markov chains fulfilling some properties and show how many state-of-the-art models for data generation fit into this framework. Indeed, including stochastic layers improves the expressivity of the network and allows for generating multimodal distributions from unimodal ones. The Markov chain point of view enables us to couple deterministic layers, such as invertible neural networks, with stochastic layers, such as Metropolis-Hastings layers, Langevin layers, variational autoencoders and diffusion normalizing flows, in a mathematically sound way. Our framework establishes a useful mathematical tool to combine the various approaches.
Joint work with F. Altekrüger, A. Denker, P. Hagemann, J. Hertrich, P. Maass.
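
As a rough sketch of how such a patch-based regularizer could be plugged into a variational reconstruction (a schematic under the assumption that a normalizing flow object flow trained on patches is available and exposes a log_prob method; image sizes, weights and the optimizer are illustrative and not the authors' code):

    import torch
    import torch.nn.functional as F

    def patch_nr(img, flow, patch_size=6, n_patches=10000):
        # Average negative log-likelihood of random patches under the learned flow.
        patches = F.unfold(img, patch_size).transpose(1, 2).reshape(-1, patch_size ** 2)
        idx = torch.randint(patches.shape[0], (n_patches,))
        return -flow.log_prob(patches[idx]).mean()

    def reconstruct(y, forward_op, flow, lam=0.1, steps=300):
        # MAP-style objective: data fidelity + lam * patch regularizer, minimized over x.
        x = torch.zeros(1, 1, 128, 128, requires_grad=True)
        opt = torch.optim.Adam([x], lr=1e-2)
        for _ in range(steps):
            opt.zero_grad()
            loss = 0.5 * torch.sum((forward_op(x) - y) ** 2) + lam * patch_nr(x, flow)
            loss.backward()
            opt.step()
        return x.detach()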

62F15 ; 60J20 ; 60J22 ; 65C05 ; 65C40 ; 68T07
