
Documents 65K05: 4 results


What does back propagation compute? - Pauwels, Edouard (Author of the talk)

Multi angle

We are interested in nonsmooth analysis of backpropagation as implemented in modern machine learning libraries such as TensorFlow or PyTorch. First I will illustrate how the blind application of differential calculus to nonsmooth objects can be problematic, which calls for a proper mathematical model. Then I will introduce a weak notion of generalized derivative, called conservativity, and illustrate how it complies with calculus and optimization for well-structured objects. We provide stability results for empirical risk minimization, similar to those in the smooth setting, for the combination of nonsmooth automatic differentiation, minibatch stochastic approximation, and first-order optimization. This is joint work with Jérôme Bolte.
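A minimal sketch of the kind of pathology the abstract alludes to, in PyTorch (the function and the point of evaluation are illustrative, not taken from the talk): f(x) = relu(x) - relu(-x) is identically equal to x, so its true derivative is 1 everywhere, yet backpropagation returns 0 at x = 0 because each ReLU reports the conventional derivative 0 there.

```python
import torch

# f(x) = relu(x) - relu(-x) equals x for every real x,
# so the true derivative is 1 at every point, including x = 0.
x = torch.tensor(0.0, requires_grad=True)
f = torch.relu(x) - torch.relu(-x)
f.backward()

# Backpropagation pushes the convention relu'(0) = 0 through the
# chain rule and returns 0 instead of the correct value 1.
print(x.grad)  # tensor(0.)
```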

65K05 ; 65K10 ; 68T99

The performance of numerical algorithms, regarding both stability and complexity, can be understood in a unified way in terms of condition numbers. This requires identifying the appropriate geometric settings and characterizing condition in geometric ways.
A probabilistic analysis of numerical algorithms can be reduced to a corresponding analysis of condition numbers, which leads to fascinating problems of geometric probability and integral geometry. The best-known example is Smale's 17th problem, which asks to find a solution of a given system of $n$ complex homogeneous polynomial equations in $n+1$ unknowns. This problem can be solved in average (and even smoothed) polynomial time.
In the course we will explain the concepts necessary to state and solve Smale's 17th problem. We also show how these ideas lead to new numerical algorithms for computing eigenpairs of matrices that provably run in average polynomial time. Making these algorithms more efficient or adapting them to structured settings are challenging and rewarding research problems. We intend to address some of these issues at the end of the course.
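As a concrete illustration of how a condition number governs stability (a small sketch with illustrative data, not material from the course): for solving Ax = b, the matrix condition number kappa(A) bounds how strongly relative perturbations of the input can be amplified in the solution.

```python
import numpy as np

# kappa(A) = ||A|| * ||A^{-1}|| (2-norm) bounds the worst-case
# amplification of relative input errors when solving A x = b.
rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-8]])   # nearly singular, hence ill-conditioned
b = np.array([2.0, 2.0])

kappa = np.linalg.cond(A)           # very large here (~1e8)
x = np.linalg.solve(A, b)

# Perturb b slightly and compare relative changes in input and output.
db = 1e-10 * rng.standard_normal(2)
x_pert = np.linalg.solve(A, b + db)
rel_in = np.linalg.norm(db) / np.linalg.norm(b)
rel_out = np.linalg.norm(x_pert - x) / np.linalg.norm(x)

# The observed amplification is of the order of kappa(A).
print(kappa, rel_out / rel_in)
```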

65F35 ; 65K05 ; 68Q15 ; 15A12 ; 65F10 ; 90C51 ; 65H10


Gradient descent for wide two-layer neural networks - Bach, Francis (Author of the talk)

Multi angle

Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks. Towards understanding this phenomenon, we analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations. We show that the limits of the gradient flow on exponentially tailed losses can be fully characterized as a max-margin classifier in a certain non-Hilbertian space of functions.
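A finite-width sketch of the setting studied here, under illustrative assumptions (network width, data, scaling, and step size are all invented for the example, not taken from the talk): a wide two-layer ReLU network, whose activation is positively homogeneous, trained on a separable toy problem with the logistic loss by full-batch gradient descent.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, d, width = 200, 2, 5000          # illustrative sizes

X = torch.randn(n, d)
y = torch.sign(X[:, 0])             # a linearly separable toy task

W = torch.randn(width, d, requires_grad=True)   # hidden-layer weights
a = torch.randn(width, requires_grad=True)      # output-layer weights

lr = 0.1
for _ in range(500):
    # ReLU is positively homogeneous; 1/width is one common scaling
    # used when studying the wide (mean-field) limit.
    out = torch.relu(X @ W.t()) @ a / width
    loss = F.softplus(-y * out).mean()   # logistic loss log(1 + e^{-y f(x)})
    loss.backward()
    with torch.no_grad():                # full-batch gradient descent step
        W -= lr * W.grad
        a -= lr * a.grad
        W.grad.zero_()
        a.grad.zero_()
print(loss.item())
```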

65K10 ; 65K05 ; 68W99 ; 68T99
