
# Documents 68T05 | records found: 15


## Post-edited  Non-backtracking spectrum of random graphs: community detection and non-regular Ramanujan graphs | Massoulié, Laurent (Speaker) | CIRM (Publisher)

A non-backtracking walk on a graph is a directed path such that no edge is the inverse of its preceding edge. The non-backtracking matrix of a graph is indexed by its directed edges and can be used to count non-backtracking walks of a given length. It has been used recently in the context of community detection and has appeared previously in connection with the Ihara zeta function and in some generalizations of Ramanujan graphs. In this work, we study the largest eigenvalues of the non-backtracking matrix of the Erdős-Rényi random graph and of the Stochastic Block Model in the regime where the number of edges is proportional to the number of vertices. Our results confirm the "spectral redemption" conjecture that community detection can be made on the basis of the leading eigenvectors above the feasibility threshold.
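The definition above translates directly into code. A minimal sketch (function and variable names are my own) that builds the non-backtracking matrix of a small undirected graph and counts non-backtracking walks through its powers:

```python
import numpy as np

def non_backtracking_matrix(edges):
    """Non-backtracking matrix B of an undirected graph: rows and columns
    are indexed by directed edges, and the entry ((u, v), (v, w)) is 1
    iff w != u, i.e. the walk never immediately reverses its last edge."""
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    index = {e: i for i, e in enumerate(directed)}
    B = np.zeros((len(directed), len(directed)))
    for (u, v) in directed:
        for (x, w) in directed:
            if x == v and w != u:
                B[index[(u, v)], index[(x, w)]] = 1.0
    return B

# On a triangle, every directed edge has exactly one admissible successor,
# so B is a permutation matrix; entries of B^(k-1) count the
# non-backtracking walks that use k edges.
B = non_backtracking_matrix([(0, 1), (1, 2), (2, 0)])
walks_of_length_3 = int(np.linalg.matrix_power(B, 2).sum())
```

The talk's results concern the leading eigenvalues of this matrix for much larger sparse random graphs.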

## Post-edited  Detection theory and novelty filters | Morel, Jean-Michel (Speaker) | CIRM (Publisher)

In this presentation, based on online demonstrations of algorithms and on the examination of several practical examples, I will reflect on the problem of modeling a detection task in images. I will place myself in the (very frequent) case where the detection task cannot be formulated in a Bayesian framework or, roughly equivalently, cannot be solved by simultaneous learning of the model of the object and that of the background. (In the case where there are plenty of examples of the background and of the object to be detected, neural networks provide a practical answer, but without explanatory power.) Nevertheless, for detection without "learning", I will show that we cannot avoid building a background model, or possibly learning it. But this will not require many examples.

Joint works with Axel Davy, Tristan Dagobert, Agnes Desolneux, Thibaud Ehret.
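The "background model without learning" idea can be illustrated with a toy significance test against a pure-noise background; this is my own sketch, not the speaker's algorithm, and all names and numbers are invented:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
image = rng.normal(0.0, 1.0, size=(64, 64))  # background model: pure Gaussian noise
image[10, 20] += 8.0                         # plant one bright anomaly

# Detection against the background model alone: flag pixels so unlikely
# under the noise model that the *expected* number of false alarms over
# the whole image stays below one (a Bonferroni-style correction).
n_tests = image.size
threshold = NormalDist().inv_cdf(1.0 - 1.0 / n_tests)
detections = {tuple(p) for p in np.argwhere(image > threshold)}
```

Note that no examples of the "object" were needed: only the background model sets the detection threshold.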

## Multi angle  New hints from the reward system | Apicella, Paul (Speaker) ; Loewenstein, Yonatan (Speaker) | CIRM (Publisher)

Start the video and click on the track button in the timeline to move to talks 1 and 2 and to the discussion.

- Talk 1: Paul Apicella - Striatal dopamine and acetylcholine mechanisms involved in reward-related learning

The midbrain dopamine system has been identified as a major component of motivation and reward processing. One of its main targets is the striatum which plays an important role in motor control and learning functions. Other subcortical neurons work in parallel with dopamine neurons. In particular, striatal cholinergic interneurons participate in signaling the reward-related significance of stimuli and they may act in concert with dopamine to encode prediction error signals and control the learning of stimulus-response associations. Recent studies have revealed functional cooperativity between these two neuromodulatory systems of a complexity far greater than previously appreciated. In this talk I will review the differences and similarities between dopamine and acetylcholine reward-signaling systems, the possible nature of reward representation in each system, and discuss the involvement of striatal dopamine-acetylcholine interactions during learning and behavior.

- Talk 2: Yonatan Loewenstein - Modeling operant learning: from synaptic plasticity to behavior

- Discussion with Paul Apicella and Yonatan Loewenstein
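The prediction-error signals mentioned in talk 1 are commonly formalized with delta-rule models; a minimal Rescorla-Wagner-style sketch (my own illustration, not material from the talks):

```python
# Rescorla-Wagner-style delta rule: the value V of a stimulus moves toward
# the delivered reward r by a fraction alpha of the prediction error, the
# quantity that midbrain dopamine responses are thought to resemble.
def train(rewards, alpha=0.2):
    V = 0.0
    errors = []
    for r in rewards:
        delta = r - V        # reward prediction error
        V += alpha * delta
        errors.append(delta)
    return V, errors

V, errors = train([1.0] * 50)   # 50 consistently rewarded trials
```

The first trial produces full surprise; as the reward becomes predicted, the error (and the modeled dopamine response) decays toward zero.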

## Multi angle  Large-scale machine learning and convex optimization 1/2 | Bach, Francis (Speaker) | CIRM (Publisher)

Many machine learning and signal processing problems are traditionally cast as convex optimization problems. A common difficulty in solving these problems is the size of the data, where there are many observations ("large n") and each of these is large ("large p"). In this setting, online algorithms such as stochastic gradient descent, which pass over the data only once, are usually preferred over batch algorithms, which require multiple passes over the data. Given n observations/iterations, the optimal convergence rates of these algorithms are $O(1/\sqrt{n})$ for general convex functions and $O(1/n)$ for strongly-convex functions. In this tutorial, I will first present the classical results in stochastic approximation and relate them to classical optimization and statistics results. I will then show how the smoothness of loss functions may be used to design novel algorithms with improved behavior, both in theory and practice: in the ideal infinite-data setting, an efficient novel Newton-based stochastic approximation algorithm leads to a convergence rate of $O(1/n)$ without strong convexity assumptions, while in the practical finite-data setting, an appropriate combination of batch and online algorithms leads to unexpected behaviors, such as a linear convergence rate for strongly convex problems, with an iteration cost similar to stochastic gradient descent.
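The single-pass regime described above can be sketched on a toy least-squares instance (step sizes and constants are my own choices, not the tutorial's):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 2000, 5
X = rng.normal(size=(n, p))
w_true = rng.normal(size=p)
y = X @ w_true + 0.1 * rng.normal(size=n)

# One pass of stochastic gradient descent on least squares, with the
# O(1/sqrt(t)) decaying step size used for general convex objectives.
w = np.zeros(p)
for t in range(n):
    x, target = X[t], y[t]
    grad = (x @ w - target) * x          # gradient of a single squared loss
    w -= 0.1 / np.sqrt(t + 1) * grad
```

Each observation is touched exactly once, so the iteration cost is $O(p)$ regardless of $n$; batch gradient descent would pay $O(np)$ per iteration.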

## Multi angle  Large-scale machine learning and convex optimization 2/2 | Bach, Francis (Speaker) | CIRM (Publisher)

Many machine learning and signal processing problems are traditionally cast as convex optimization problems. A common difficulty in solving these problems is the size of the data, where there are many observations ("large n") and each of these is large ("large p"). In this setting, online algorithms such as stochastic gradient descent, which pass over the data only once, are usually preferred over batch algorithms, which require multiple passes over the data. Given n observations/iterations, the optimal convergence rates of these algorithms are $O(1/\sqrt{n})$ for general convex functions and $O(1/n)$ for strongly-convex functions. In this tutorial, I will first present the classical results in stochastic approximation and relate them to classical optimization and statistics results. I will then show how the smoothness of loss functions may be used to design novel algorithms with improved behavior, both in theory and practice: in the ideal infinite-data setting, an efficient novel Newton-based stochastic approximation algorithm leads to a convergence rate of $O(1/n)$ without strong convexity assumptions, while in the practical finite-data setting, an appropriate combination of batch and online algorithms leads to unexpected behaviors, such as a linear convergence rate for strongly convex problems, with an iteration cost similar to stochastic gradient descent.

## Multi angle  Model-free control and deep learning | Bellemare, Marc (Speaker) | CIRM (Publisher)

In this talk I will present some recent developments in model-free reinforcement learning applied to large state spaces, with an emphasis on deep learning and its role in estimating action-value functions. The talk will cover a variety of model-free algorithms, including variations on Q-Learning, and some of the main techniques that make the approach practical. I will illustrate the usefulness of these methods with examples drawn from the Arcade Learning Environment, the popular set of Atari 2600 benchmark domains.
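As a minimal illustration of the model-free, value-based approach, here is tabular Q-learning on a toy problem (my own sketch, far from the deep-learning setting of the talk):

```python
import numpy as np

# Tabular Q-learning on a 5-state chain: actions 0 = left, 1 = right,
# reward 1 only on reaching the rightmost state. The behaviour policy is
# uniformly random; Q-learning is off-policy, so it still learns the
# action values of the greedy policy.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.5, 0.9
rng = np.random.default_rng(0)

for _ in range(300):
    s = 0
    while s != n_states - 1:
        a = int(rng.integers(n_actions))
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # bootstrap from the best action available in the next state
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
```

Deep Q-learning replaces the table `Q` with a neural network over raw observations, which is what makes the approach viable on large state spaces such as Atari frames.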

## Multi angle  Big Data: Tremendous challenges, great solutions | Bougé, Luc (Speaker) | CIRM (Publisher)

The emergence of "Big Data" is profoundly changing our understanding of the algorithmic processing of information. The center of gravity has shifted from computation to data, and scalability has become a central notion. In particular, accounting for the geographical location of data, the cost of moving it, and its availability have become major factors in application design.
This new "data-centric, scaling-aware" vision completely renews the problematics of algorithmics and programming, both in the theoretical tools used and in the practical methodologies deployed. This talk will present some of the aspects affected and propose directions for adapting computer science education to this new landscape.
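The cost of moving data, stressed above, can be made concrete with a back-of-the-envelope model (all numbers invented for illustration):

```python
# Data-centric principle: with large datasets it is far cheaper to move
# the code to the data than the data to the code.
data_size_gb = 500.0        # dataset held on a remote storage node
binary_size_gb = 0.001      # the analysis program itself
network_gb_per_s = 1.0      # effective transfer rate

time_moving_data = data_size_gb / network_gb_per_s      # seconds to ship the data
time_moving_code = binary_size_gb / network_gb_per_s    # seconds to ship the code
```

This asymmetry is why data-centric frameworks schedule computation on the nodes that already hold the data.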

## Multi angle  Random forests variable importances: towards a better understanding and large-scale feature selection | Geurts, Pierre (Speaker) | CIRM (Publisher)

Random forests are among the most popular supervised machine learning methods. One of their most practically useful features is the possibility to derive from the ensemble of trees an importance score for each input variable that assesses its relevance for predicting the output. These importance scores have been successfully applied on many problems, notably in bioinformatics, but they are still not well understood from a theoretical point of view. In this talk, I will present our recent works towards a better understanding, and consequently a better exploitation, of these measures. In the first part of my talk, I will present a theoretical analysis of the mean decrease impurity importance in asymptotic ensemble and sample size conditions. Our main results include an explicit formulation of this measure in the case of ensembles of totally randomized trees and a discussion of the conditions under which this measure is consistent with respect to a common definition of variable relevance. The second part of the talk will be devoted to the analysis of finite tree ensembles in a constrained framework that assumes that each tree can be built only from a subset of variables of fixed size. This setting is motivated by very high-dimensional problems, or embedded systems, where one cannot assume that all variables fit into memory. We first consider a simple method that grows each tree on a subset of variables randomly and uniformly selected among all variables. We analyse the consistency and convergence rate of this method for the identification of all relevant variables under various problem and algorithm settings. From this analysis, we then motivate and design a modified variable sampling mechanism that is shown to significantly improve convergence in several conditions.
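The mean decrease impurity importance analyzed in the talk can be caricatured with depth-1 trees; this sketch is my own simplification, not the method from the talk:

```python
import numpy as np

def stump_importance(X, y, n_trees=50, seed=0):
    """Average decrease in Gini impurity of a median split per feature,
    over bootstrap samples: a depth-1 caricature of the 'mean decrease
    impurity' importance used by random forests."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    imp = np.zeros(p)

    def gini(labels):
        if len(labels) == 0:
            return 0.0
        q = np.mean(labels)
        return 2 * q * (1 - q)

    for _ in range(n_trees):
        idx = rng.integers(n, size=n)          # bootstrap sample
        Xb, yb = X[idx], y[idx]
        for j in range(p):
            t = np.median(Xb[:, j])            # one candidate threshold
            left, right = yb[Xb[:, j] <= t], yb[Xb[:, j] > t]
            decrease = gini(yb) - (len(left) * gini(left) + len(right) * gini(right)) / n
            imp[j] += max(decrease, 0.0) / n_trees
    return imp

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))
y = (X[:, 0] > 0).astype(float)    # only feature 0 is relevant
imp = stump_importance(X, y)
```

The relevant feature receives an importance close to the label impurity (about 0.5 here), while irrelevant features score near zero; the theoretical questions in the talk concern exactly when and why such scores are consistent.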

## Multi angle  A review of different geometries for the training of neural networks | Malago, Luigi (Speaker) | CIRM (Publisher)

Neural networks constitute a varied class of computational models, used in machine learning for both supervised and unsupervised learning. Several network topologies have been proposed in the literature, since the preliminary work from the late 50s, including models based on undirected probabilistic graphical models, such as (Restricted) Boltzmann Machines, and on multi-layer feed-forward computational graphs. The training of a neural network is usually performed by the minimization of a cost function, such as the negative log-likelihood. During the talk we will review alternative geometries used to describe the space of the functions encoded by a neural network, parametrized by its connection weights, and the implications for the optimization of the cost function during training, from the perspective of Riemannian optimization. In the first part of the presentation, we will introduce a probabilistic interpretation of neural networks, which goes back to the work of Amari and coauthors from the 90s, and which is based on the Fisher-Rao metric studied in Information Geometry. In this framework, the weights of a Boltzmann Machine, and similarly for feed-forward neural networks, are interpreted as the parameters of a (joint) statistical model for the observed, and possibly latent, variables. In the second part of the talk, we will review other approaches, motivated by invariance principles in neural networks and not explicitly based on probabilistic models, to the definition of alternative geometries for the space of the parameters of a neural network. The use of alternative non-Euclidean geometries has a direct impact on the training algorithms: modeling the space of functions associated with a neural network as a Riemannian manifold makes the gradient depend on the choice of metric tensor. We conclude the presentation by reviewing some recently proposed training algorithms for neural networks based on Riemannian optimization.
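The Fisher-Rao geometry mentioned above yields the natural gradient: the ordinary gradient preconditioned by the inverse Fisher information. A minimal sketch for a one-parameter Bernoulli model (my own illustration, not from the talk), where the preconditioning collapses the update to $\theta \leftarrow \theta + \eta(\bar{x} - \theta)$:

```python
import numpy as np

# Natural-gradient ascent for a Bernoulli model p(x = 1) = theta,
# whose Fisher information is F(theta) = 1 / (theta * (1 - theta)).
def natural_gradient_step(theta, data, lr=0.1):
    # derivative of the average log-likelihood
    grad = np.mean(data / theta - (1 - data) / (1 - theta))
    fisher = 1.0 / (theta * (1 - theta))
    return theta + lr * grad / fisher   # ascent preconditioned by F^-1

data = np.array([1, 1, 1, 0])  # empirical mean 0.75
theta = 0.5
for _ in range(100):
    theta = natural_gradient_step(theta, data)
```

The metric-dependent step converges to the maximum-likelihood estimate at a rate independent of the current parametrization, which is the appeal of the Riemannian viewpoint.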

## Multi angle  Still searching the engram: should we? | Mongillo, Gianluigi (Speaker) ; Segal, Menahem (Speaker) | CIRM (Publisher)

Start the video and click on the track button in the timeline to move to talks 1 and 2 and to the discussion.

- Talk 1: Gianluigi Mongillo - Inhibitory connectivity defines the realm of excitatory plasticity

- Talk 2: Menahem Segal - Determinants of network activity: Lessons from dissociated hippocampal cultures

- Discussion with Gianluigi Mongillo and Menahem Segal

## Multi angle  Individualized rank aggregation using nuclear norm regularization | Negahban, Sahand (Speaker) | CIRM (Publisher)

In recent years rank aggregation has received significant attention from the machine learning community. The goal of such a problem is to combine the (partially revealed) preferences over objects of a large population into a single, relatively consistent ordering of those objects. However, in many cases, we might not want a single ranking and instead opt for individual rankings. We study a version of the problem known as collaborative ranking. In this problem we assume that individual users provide us with pairwise preferences (for example purchasing one item over another). From those preferences we wish to obtain rankings on items that the users have not had an opportunity to explore. The results here have a very interesting connection to the standard matrix completion problem. We provide a theoretical justification for a nuclear norm regularized optimization procedure, and provide high-dimensional scaling results that show how the error in estimating user preferences behaves as the number of observations increases.

rank aggregation - nuclear norm - rank centrality - convex optimization - regularized $M$-estimation - matrix completion - collaborative filtering
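Nuclear norm regularization is typically handled through singular value soft-thresholding; a minimal sketch of that proximal step (my own illustration, not the paper's full procedure):

```python
import numpy as np

def svd_soft_threshold(M, tau):
    """Proximal operator of the nuclear norm: shrink every singular value
    of M by tau (clipping at zero). This is the basic building block of
    nuclear-norm-regularized matrix completion solvers."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5)) @ rng.normal(size=(5, 30))  # rank-5 preference matrix
shrunk = svd_soft_threshold(A, tau=1.0)
```

Shrinking singular values biases the estimate toward low rank, which encodes the assumption that user preferences are explained by a few latent factors.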

## Multi angle  Bandits in auctions (& more) | Perchet, Vianney (Speaker) | CIRM (Publisher)

In this talk, I will introduce the classical theory of multi-armed bandits, a field at the junction of statistics, optimization, game theory and machine learning, discuss the possible applications, and highlight the new perspectives and open questions that they propose.

We consider competitive capacity investment for a duopoly of two distinct producers. The producers are exposed to stochastically fluctuating costs and interact through aggregate supply. Capacity expansion is irreversible and modeled in terms of timing strategies characterized through threshold rules. Because the impact of changing costs on the producers is asymmetric, we are led to a nonzero-sum timing game describing the transitions among the discrete investment stages. Working in a continuous-time diffusion framework, we characterize and analyze the resulting Nash equilibrium and game values. Our analysis quantifies the dynamic competition effects and yields insight into dynamic preemption and over-investment in a general asymmetric setting. A case study considering the impact of fluctuating emission costs on power producers investing in nuclear and coal-fired plants is also presented.
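The classical bandit trade-off can be sketched with the UCB1 strategy (my own illustration; names and parameters are invented):

```python
import math
import random

# UCB1: pull the arm maximizing empirical mean + exploration bonus,
# which balances exploiting good arms against exploring uncertain ones.
def ucb1(means, horizon, seed=0):
    rng = random.Random(seed)
    counts = [0] * len(means)
    sums = [0.0] * len(means)
    for t in range(1, horizon + 1):
        if t <= len(means):
            a = t - 1                     # play each arm once to initialize
        else:
            a = max(range(len(means)),
                    key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < means[a] else 0.0  # Bernoulli arm
        counts[a] += 1
        sums[a] += reward
    return counts

counts = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

The pull counts concentrate on the best arm while suboptimal arms are pulled only logarithmically often, which is the regret guarantee the classical theory provides.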

## Multi angle  Learning on the symmetric group | Vert, Jean-Philippe (Speaker) | CIRM (Publisher)

Many data can be represented as rankings or permutations, raising the question of developing machine learning models on the symmetric group. When the number of items in the permutations gets large, manipulating permutations can quickly become computationally intractable. I will discuss two computationally efficient embeddings of the symmetric group into Euclidean spaces leading to fast machine learning algorithms, and illustrate their relevance on biological applications and image classification.
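One classical Euclidean embedding of the symmetric group uses pairwise comparison indicators; a minimal sketch (my own illustration; the talk's specific embeddings may differ):

```python
import itertools
import numpy as np

def kendall_embedding(perm):
    """Embed a permutation of {0, ..., n-1} as the +/-1 vector of pairwise
    comparisons, one coordinate per pair (i, j) with i < j. Squared
    Euclidean distances between embeddings recover the Kendall tau
    distance up to a constant factor."""
    pos = {v: k for k, v in enumerate(perm)}
    return np.array([1.0 if pos[i] < pos[j] else -1.0
                     for i, j in itertools.combinations(range(len(perm)), 2)])

a = kendall_embedding([0, 1, 2, 3])
b = kendall_embedding([1, 0, 2, 3])   # one adjacent swap
```

The embedding has dimension $\binom{n}{2}$, so linear models over it can be trained with standard Euclidean machinery instead of manipulating the $n!$ permutations directly.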
