
## Multi angle  Big Data: Tremendous challenges, great solutions - Bougé, Luc (Author of the talk) | CIRM (Publisher)

The emergence of "Big Data" is profoundly changing our understanding of the algorithmic processing of information. The center of gravity has shifted from computation to data, and scalability has become a central notion. In particular, the geographic location of data, the cost of moving it, and its availability have become major factors in application design.
This new "data-centric" and "scaling-aware" vision completely renews the problems of algorithmics and programming, both in the theoretical tools used and in the practical methodologies deployed. This talk will present some of the aspects affected by this shift and will suggest directions for adapting computer science education to this new landscape.
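As a purely illustrative sketch of the data-centric style evoked above (not material from the talk), the toy example below keeps the heavy work next to each data partition and only moves small summaries; the partitioning and the word-count task are assumptions made for the example.

```python
# Minimal sketch of "move computation to the data": each partition is reduced
# locally, and only compact summaries (Counters) travel to be merged.
from collections import Counter
from functools import reduce

partitions = [                                   # data assumed to live in different places
    ["big data moves the focus from computation to data"],
    ["moving data is costly so computation moves to the data"],
    ["scaling aware algorithms keep data movement local"],
]

def local_count(lines):
    # map + local reduce, executed where the partition is stored
    c = Counter()
    for line in lines:
        c.update(line.split())
    return c

local_summaries = [local_count(p) for p in partitions]   # only small Counters move
total = reduce(lambda a, b: a + b, local_summaries, Counter())

print(total.most_common(5))
```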


## Multi angle  Large-scale machine learning and convex optimization 2/2 - Bach, Francis (Author of the talk) | CIRM (Publisher)

Many machine learning and signal processing problems are traditionally cast as convex optimization problems. A common difficulty in solving these problems is the size of the data, where there are many observations ("large n") and each of these is large ("large p"). In this setting, online algorithms such as stochastic gradient descent, which pass over the data only once, are usually preferred over batch algorithms, which require multiple passes over the data. Given n observations/iterations, the optimal convergence rates of these algorithms are $O(1/\sqrt{n})$ for general convex functions and $O(1/n)$ for strongly-convex functions. In this tutorial, I will first present the classical results in stochastic approximation and relate them to classical optimization and statistics results. I will then show how the smoothness of loss functions may be used to design novel algorithms with improved behavior, both in theory and practice: in the ideal infinite-data setting, an efficient novel Newton-based stochastic approximation algorithm leads to a convergence rate of $O(1/n)$ without strong convexity assumptions, while in the practical finite-data setting, an appropriate combination of batch and online algorithms leads to unexpected behaviors, such as a linear convergence rate for strongly convex problems, with an iteration cost similar to stochastic gradient descent.
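The finite-data behavior mentioned at the end of the abstract (linear convergence at SGD-like cost on strongly convex finite sums) is obtained by stochastic average gradient (SAG)-type methods. Below is a minimal numpy sketch, not the lecturer's code, of the SAG update on an L2-regularized least-squares problem; the data, step size, and iteration count are illustrative assumptions.

```python
# Minimal SAG sketch: per-iteration cost of SGD, but a full (averaged) gradient
# estimate is maintained from stored per-sample gradients.
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 1000, 20, 1e-2               # n observations, p features, L2 strength
X = rng.standard_normal((n, p))
w_true = rng.standard_normal(p)
y = X @ w_true + 0.1 * rng.standard_normal(n)

def grad_i(w, i):
    # gradient of f_i(w) = 0.5*(x_i.w - y_i)^2 + 0.5*lam*||w||^2
    return X[i] * (X[i] @ w - y[i]) + lam * w

L = np.max(np.sum(X**2, axis=1)) + lam   # bound on per-sample smoothness
step = 1.0 / (16 * L)                    # conservative SAG step size

w = np.zeros(p)
grad_table = np.zeros((n, p))            # stored gradient for each sample
grad_sum = np.zeros(p)

for t in range(50 * n):                  # a few effective passes over the data
    i = rng.integers(n)
    g_new = grad_i(w, i)
    grad_sum += g_new - grad_table[i]    # refresh the running sum of stored gradients
    grad_table[i] = g_new
    w -= step * grad_sum / n             # SAG update: average of stored gradients

print("final objective:",
      0.5 * np.mean((X @ w - y) ** 2) + 0.5 * lam * np.sum(w ** 2))
```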


## Multi angle  Large-scale machine learning and convex optimization 1/2 - Bach, Francis (Author of the talk) | CIRM (Publisher)

Many machine learning and signal processing problems are traditionally cast as convex optimization problems. A common difficulty in solving these problems is the size of the data, where there are many observations ("large n") and each of these is large ("large p"). In this setting, online algorithms such as stochastic gradient descent, which pass over the data only once, are usually preferred over batch algorithms, which require multiple passes over the data. Given n observations/iterations, the optimal convergence rates of these algorithms are $O(1/\sqrt{n})$ for general convex functions and $O(1/n)$ for strongly-convex functions. In this tutorial, I will first present the classical results in stochastic approximation and relate them to classical optimization and statistics results. I will then show how the smoothness of loss functions may be used to design novel algorithms with improved behavior, both in theory and practice: in the ideal infinite-data setting, an efficient novel Newton-based stochastic approximation algorithm leads to a convergence rate of $O(1/n)$ without strong convexity assumptions, while in the practical finite-data setting, an appropriate combination of batch and online algorithms leads to unexpected behaviors, such as a linear convergence rate for strongly convex problems, with an iteration cost similar to stochastic gradient descent.
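For a concrete picture of the single-pass online setting the tutorial starts from, here is a minimal numpy sketch, not the lecturer's code, of stochastic gradient descent with Polyak-Ruppert averaging on a streaming least-squares problem; the data model and the step-size schedule are illustrative assumptions.

```python
# Single-pass SGD with iterate averaging: each observation is used once,
# and the averaged iterate typically estimates w_true better than the last one.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100_000, 10
w_true = rng.standard_normal(p)

w = np.zeros(p)
w_avg = np.zeros(p)
for t in range(1, n + 1):
    x = rng.standard_normal(p)                    # one fresh observation per step
    y = x @ w_true + 0.1 * rng.standard_normal()
    grad = x * (x @ w - y)                        # stochastic gradient of 0.5*(x.w - y)^2
    step = 1.0 / (2 * p * np.sqrt(t))             # ~ 1/(R^2 sqrt(t)), with R^2 = E||x||^2 = p
    w -= step * grad
    w_avg += (w - w_avg) / t                      # running Polyak-Ruppert average

print("error of last iterate    :", np.linalg.norm(w - w_true))
print("error of averaged iterate:", np.linalg.norm(w_avg - w_true))
```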


## Multi angle  Random forests variable importances: towards a better understanding and large-scale feature selection - Geurts, Pierre (Author of the talk) | CIRM (Publisher)

Random forests are among the most popular supervised machine learning methods. One of their most practically useful features is the possibility to derive from the ensemble of trees an importance score for each input variable that assesses its relevance for predicting the output. These importance scores have been successfully applied on many problems, notably in bioinformatics, but they are still not well understood from a theoretical point of view. In this talk, I will present our recent works towards a better understanding, and consequently a better exploitation, of these measures. In the first part of my talk, I will present a theoretical analysis of the mean decrease impurity importance in asymptotic ensemble and sample size conditions. Our main results include an explicit formulation of this measure in the case of ensembles of totally randomized trees and a discussion of the conditions under which this measure is consistent with respect to a common definition of variable relevance. The second part of the talk will be devoted to the analysis of finite tree ensembles in a constrained framework that assumes that each tree can be built only from a subset of variables of fixed size. This setting is motivated by very high-dimensional problems, or embedded systems, where one cannot assume that all variables fit into memory. We first consider a simple method that grows each tree on a subset of variables randomly and uniformly selected among all variables. We analyse the consistency and convergence rate of this method for the identification of all relevant variables under various problem and algorithm settings. From this analysis, we then motivate and design a modified variable sampling mechanism that is shown to significantly improve convergence in several conditions.
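As a concrete illustration (not the talk's experiments), the sketch below computes mean decrease impurity (MDI) importances with scikit-learn, whose feature_importances_ attribute implements MDI; the synthetic dataset and hyperparameters are assumptions, and max_features=1 mimics the totally randomized trees analysed in the first part of the talk.

```python
# MDI importances from a tree ensemble: informative features should receive
# noticeably larger scores than the pure-noise ones.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# shuffle=False places the 5 informative features first (indices 0..4)
X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           n_redundant=0, shuffle=False, random_state=0)

# max_features=1 is close to the "totally randomized trees" setting;
# larger values recover the usual random forest behaviour.
forest = ExtraTreesClassifier(n_estimators=500, max_features=1, random_state=0)
forest.fit(X, y)

ranking = np.argsort(forest.feature_importances_)[::-1]
for idx in ranking[:10]:
    print(f"feature {idx:2d}  MDI importance = {forest.feature_importances_[idx]:.4f}")
```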


## Multi angle  Individualized rank aggregation using nuclear norm regularization - Neghaban, Sahand (Author of the talk) | CIRM (Publisher)

In recent years rank aggregation has received significant attention from the machine learning community. The goal of such a problem is to combine the (partially revealed) preferences over objects of a large population into a single, relatively consistent ordering of those objects. However, in many cases, we might not want a single ranking and instead opt for individual rankings. We study a version of the problem known as collaborative ranking. In this problem we assume that individual users provide us with pairwise preferences (for example purchasing one item over another). From those preferences we wish to obtain rankings on items that the users have not had an opportunity to explore. The results here have a very interesting connection to the standard matrix completion problem. We provide a theoretical justification for a nuclear norm regularized optimization procedure, along with high-dimensional scaling results that show how the error in estimating user preferences behaves as the number of observations increases.

rank aggregation - nuclear norm - rank centrality - convex optimization - regularized $M$-estimation - matrix completion - collaborative filtering
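For illustration, here is a minimal numpy sketch, not the speaker's estimator, of nuclear norm regularized matrix completion solved by proximal gradient with singular value soft-thresholding; collaborative ranking replaces the squared entry-wise loss with a pairwise preference loss, but the nuclear norm proximal step is the same. Problem sizes, the noise level, and the regularization weight are assumptions made for the example.

```python
# Nuclear norm regularized matrix completion via proximal gradient:
# minimize 0.5*||mask*(M - Y)||_F^2 + lam*||M||_* over the user-item matrix M.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, rank = 100, 80, 3
M_true = rng.standard_normal((n_users, rank)) @ rng.standard_normal((rank, n_items))
mask = rng.random((n_users, n_items)) < 0.3          # 30% of entries observed
Y = M_true + 0.1 * rng.standard_normal(M_true.shape)

lam, step = 5.0, 1.0                                 # regularization weight and step size

def prox_nuclear(A, tau):
    # soft-threshold the singular values: proximal operator of tau * ||.||_*
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

M = np.zeros_like(Y)
for _ in range(200):
    grad = mask * (M - Y)                            # gradient of the observed-entry loss
    M = prox_nuclear(M - step * grad, step * lam)

err = np.linalg.norm(M - M_true) / np.linalg.norm(M_true)
print(f"relative recovery error: {err:.3f}")
```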

## Multi angle  New hints from the reward system - Apicella, Paul (Author of the talk); Loewenstein, Yonatan (Author of the talk) | CIRM (Publisher)

Start the video and click on the track button in the timeline to move to talks 1 and 2 and to the discussion.

- Talk 1: Paul Apicella - Striatal dopamine and acetylcholine mechanisms involved in reward-related learning

The midbrain dopamine system has been identified as a major component of motivation and reward processing. One of its main targets is the striatum, which plays an important role in motor control and learning functions. Other subcortical neurons work in parallel with dopamine neurons. In particular, striatal cholinergic interneurons participate in signaling the reward-related significance of stimuli, and they may act in concert with dopamine to encode prediction error signals and control the learning of stimulus-response associations. Recent studies have revealed functional cooperativity between these two neuromodulatory systems of a complexity far greater than previously appreciated. In this talk I will review the differences and similarities between dopamine and acetylcholine reward-signaling systems, the possible nature of reward representation in each system, and discuss the involvement of striatal dopamine-acetylcholine interactions during learning and behavior.

- Talk 2: Yonatan Loewenstein - Modeling operant learning: from synaptic plasticity to behavior

- Discussion with Paul Apicella and Yonatan Loewenstein