CIRM - Videos & books Library - The linear algebra of Large Language Models

Multi angle

Authors: Saad, Yousef (Speaker)
CIRM (Publisher)

Abstract: In an era where Artificial Intelligence (AI) is permeating virtually every field of science and engineering, it is becoming critical for members of the numerical linear algebra community to understand and embrace AI, to contribute to its advancement, and more broadly to the advancement of machine learning. What is fascinating and rather encouraging is that Numerical Linear Algebra (NLA) is at the core of machine learning and AI. In this talk we will give an overview of Deep Learning with an emphasis on Large Language Models (LLMs) and Transformers [3, 4]. The very first step of LLMs is to convert the problem into one that can be exploited by numerical methods, or to be more accurate, by optimization techniques. All AI methods rely almost entirely on essentially 4 ingredients: data, optimization methods, statistical intuition, and linear algebra. Thus, the first task is to map words or sentences into tokens, which are then embedded into Euclidean spaces. From there on, the models operate on vectors and matrices. We will show a few examples of important developments in ML that were heavily based on linear algebra ideas. Among these, we will briefly discuss LoRA [1], a technique in which low-rank approximation was used to reduce computational cost in some models, leading to gains of a few orders of magnitude. Another contribution that used purely algebraic arguments and that had a major impact on LLMs is the article [2]. Here the main discovery is that the nonlinear "self-attention" in LLMs can be approximated linearly, resulting in huge savings in computations, as the computational complexity was decreased from $O(n^2)$ to $O(n)$, where $n$ is the sequence length. The talk will be mostly a survey of known recent methods in AI with the primary goal of unraveling the mathematics of Transformers. A secondary goal is to initiate a discussion on the issue of how NLA specialists can participate in AI research.
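
The two linear algebra ideas named in the abstract can be illustrated with a small NumPy sketch. It is not taken from the talk: the function names, matrix sizes, and random test data are assumptions chosen for illustration. The first part forms a LoRA-style update W + BA with thin factors B and A and compares parameter counts. The second part computes a non-causal kernelized attention twice, once through the explicit n x n similarity matrix and once by reassociating the product as in [2], using the elu(x) + 1 feature map in place of the softmax similarity.

```python
import numpy as np

def lora_update(W, A, B, scale=1.0):
    """Adapted weight W + scale * B @ A; the update B @ A has rank at most r."""
    return W + scale * (B @ A)

def feature_map(x):
    """elu(x) + 1: strictly positive, so the attention normalizer never vanishes."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def quadratic_attention(Q, K, V):
    """Kernelized attention via the explicit n x n similarity matrix: O(n^2)."""
    Qf, Kf = feature_map(Q), feature_map(K)
    S = Qf @ Kf.T                                # (n, n) similarities
    return (S @ V) / S.sum(axis=1, keepdims=True)

def linear_attention(Q, K, V):
    """Same result by associativity, (Qf Kf^T) V = Qf (Kf^T V): O(n) in n."""
    Qf, Kf = feature_map(Q), feature_map(K)
    KV = Kf.T @ V                                # (d, d_v), independent of n
    z = Kf.sum(axis=0)                           # (d,), shared normalizer term
    return (Qf @ KV) / (Qf @ z)[:, None]

rng = np.random.default_rng(0)

# LoRA part: illustrative sizes (assumed, not from the source).
d_out, d_in, r = 256, 256, 8
W = rng.standard_normal((d_out, d_in))           # frozen pretrained weight
B = rng.standard_normal((d_out, r))              # thin trainable factor
A = rng.standard_normal((r, d_in))               # thin trainable factor
W_adapted = lora_update(W, A, B)
full_params = d_out * d_in                       # parameters in a dense update
lora_params = r * (d_in + d_out)                 # parameters in the two factors
print("dense update parameters:", full_params)
print("low-rank update parameters:", lora_params)

# Attention part: n tokens, feature dimension d, value dimension d_v.
n, d, d_v = 64, 8, 8
Q = rng.standard_normal((n, d))
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d_v))
out_quadratic = quadratic_attention(Q, K, V)
out_linear = linear_attention(Q, K, V)
print("outputs agree:", np.allclose(out_quadratic, out_linear))
```

In this sketch the linear variant never forms an n x n matrix, so its cost grows proportionally to n for fixed d and d_v. The LoRA paper initializes B to zero so that training starts from the pretrained weight; random factors are used here only so that the rank of the update is visible.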

Keywords: Numerical Linear Algebra; Large Language Models; Acceleration methods

MSC codes:
65F99 - None of the above but in this section
68T99 - None of the above but in this section

Video information

Director: Recanzone, Luca
Download: https://videos.cirm-math.fr/2024-09-17_saad.mp4

Meeting information

Meeting name: Numerical Linear Algebra / Algèbre Linéaire Numérique
Meeting organizers: Brezinski, Claude; Chehab, Jean-Paul; Redivo-Zaglia, Michela; Rodriguez, Giuseppe; Sadok, Hassane
Dates: 16/09/2024 - 20/09/2024
Meeting year: 2024
Meeting URL: https://conferences.cirm-math.fr/3064.html

Citation data

DOI: 10.24350/CIRM.V.20246503
Cite this video: Saad, Yousef (2024). The linear algebra of Large Language Models. CIRM. Audiovisual resource. doi:10.24350/CIRM.V.20246503
URI: http://dx.doi.org/10.24350/CIRM.V.20246503

See also

[Multi angle] Convergence analysis and parameter choice for the iterated Arnoldi-Tikhonov method / Speaker: Reichel, Lothar.
[Multi angle] When is the resolvent like a rank one matrix? / Speaker: Greenbaum, Anne.
[Multi angle] What is new in domain decomposition? / Speaker: Gander, Martin.

Bibliography

[1] HU, Edward J., SHEN, Yelong, WALLIS, Phillip, et al. LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685, 2021. - https://doi.org/10.48550/arXiv.2106.09685

[2] KATHAROPOULOS, Angelos, VYAS, Apoorv, PAPPAS, Nikolaos, et al. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention. arXiv preprint arXiv:2006.16236, 2020. - https://doi.org/10.48550/arXiv.2006.16236

[3] MURPHY, Kevin P. Probabilistic machine learning: an introduction. MIT Press, 2022.

[4] ZHANG, Aston, LIPTON, Zachary C., LI, Mu and SMOLA, Alexander J. Dive into Deep Learning. Cambridge University Press, 2023. - https://D2L.ai
