
Multi angle

Gradient descent for wide two-layer neural networks

Authors: Bach, Francis (Conference speaker)
CIRM (Publisher)


    Abstract: Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks. Towards understanding this phenomenon, we analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations. We show that the limits of the gradient flow on exponentially tailed losses can be fully characterized as a max-margin classifier in a certain non-Hilbertian space of functions.
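    The setting of the abstract can be illustrated with a small numerical sketch. This is only a toy, not the talk's exact construction: the abstract studies infinitely wide networks under gradient *flow*, whereas below we run plain full-batch gradient descent on a finite, moderately wide two-layer ReLU network (ReLU being positively homogeneous, matching the homogeneous activations mentioned). The width, step size, iteration count, and dataset are all arbitrary illustrative choices.

```python
import numpy as np

# Toy sketch: full-batch gradient descent on the logistic loss for a
# two-layer ReLU network f(x) = b . relu(A x) / sqrt(width).
# All hyperparameters here are illustrative, not taken from the talk.

rng = np.random.default_rng(0)
n, d, width = 200, 2, 512                      # samples, input dim, hidden width

# Toy linearly separable binary labels in {-1, +1}
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])

A = rng.normal(size=(width, d)) / np.sqrt(d)   # hidden-layer weights
b = rng.normal(size=width)                     # output weights

def forward(X):
    """ReLU features and network output."""
    H = np.maximum(X @ A.T, 0.0)               # (n, width); ReLU is positively homogeneous
    return H, H @ b / np.sqrt(width)

lr, losses = 2.0, []
for _ in range(1000):
    H, f = forward(X)
    margin = y * f
    losses.append(np.mean(np.log1p(np.exp(-margin))))  # logistic (cross-entropy) loss
    g = -y / (1.0 + np.exp(margin)) / n                # dLoss/df for each sample
    grad_b = (H.T @ g) / np.sqrt(width)
    # (H > 0) is the ReLU derivative; row j of grad_A is sum_i g_i * b_j * 1[a_j.x_i > 0] * x_i / sqrt(width)
    grad_A = (g[:, None] * (H > 0.0) * (b / np.sqrt(width))[None, :]).T @ X
    b -= lr * grad_b
    A -= lr * grad_A

H, f = forward(X)
acc = np.mean(np.sign(f) == y)                 # training accuracy of the learned classifier
```

    On this separable toy problem, the loss keeps decreasing and the classifier's margins keep growing, which is the finite-width analogue of the max-margin limiting behavior described in the abstract.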

    Keywords: optimization; neural networks; machine learning

    MSC Codes:
    65K05 - Mathematical programming methods
    65K10 - Optimization and variational techniques
    68T99 - None of the above but in this section
    68W99 - None of the above

      Video Information

      Director: Hennenfent, Guillaume
      Language: English
      Publication date: 06/04/2020
      Recording date: 12/03/2020
      Collection: Computer Science ; Control Theory and Optimization ; Probability and Statistics
      Subcollection: Research talks
      Format: MP4
      Domain: Computer Science ; Control Theory & Optimization ; Probability & Statistics
      Duration: 00:47:31
      Audience: Researchers ; PhD students ; Postdoctoral researchers
      Download: https://videos.cirm-math.fr/2020-03-12_Bach.mp4/

    Meeting Information

    Meeting name: Optimization for Machine Learning / Optimisation pour l’apprentissage automatique
    Meeting organizers: Boyer, Claire ; d'Aspremont, Alexandre ; Gramfort, Alexandre ; Salmon, Joseph ; Villar, Soledad
    Dates: 09/03/2020 - 13/03/2020
    Meeting year: 2020
    Conference URL: https://conferences.cirm-math.fr/2133.html

    Citation Data

    DOI: 10.24350/CIRM.V.19622703
    Cite this video as: Bach, Francis (2020). Gradient descent for wide two-layer neural networks. CIRM. Audiovisual resource. doi:10.24350/CIRM.V.19622703
    URI : http://dx.doi.org/10.24350/CIRM.V.19622703
