Machine learning pipelines often rely on optimization procedures to make discrete decisions (e.g. sorting, picking closest neighbors, finding shortest paths or optimal matchings). Although these discrete decisions are easily computed in a forward manner, they cannot be used to modify model parameters using first-order optimization techniques because they break the back-propagation of computational graphs. In order to expand the scope of learning problems that can be solved in an end-to-end fashion, we propose a systematic method to transform a block that outputs an optimal discrete decision into a differentiable operation. Our approach relies on stochastic perturbations of these parameters, and can be used readily within existing solvers without the need for ad hoc regularization or smoothing. These perturbed optimizers yield solutions that are differentiable and never locally constant. The amount of smoothness can be tuned via the chosen noise amplitude, whose impact we analyze. The derivatives of these perturbed solvers can be evaluated efficiently. We also show how this framework can be connected to a family of losses developed in structured prediction, and describe how these can be used in unsupervised and supervised learning, with theoretical guarantees.
We demonstrate the performance of our approach on several machine learning tasks in experiments on synthetic and real data.
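As a rough illustration of the idea described in this abstract, the sketch below smooths a one-hot argmax by averaging it over Gaussian perturbations of its input. It is a minimal Monte Carlo sketch under stated assumptions, not the speakers' implementation: the function names, the choice of Gaussian noise, the noise amplitude `epsilon`, and the sample count are all hypothetical choices made for illustration. The Jacobian estimate uses the standard Gaussian identity that the derivative of E[y*(theta + epsilon Z)] with respect to theta equals E[y*(theta + epsilon Z) Z^T] / epsilon.

```python
import numpy as np

def one_hot_argmax(theta):
    """Hard discrete decision: a one-hot vector at the largest entry of each row."""
    out = np.zeros_like(theta)
    out[np.arange(theta.shape[0]), theta.argmax(axis=1)] = 1.0
    return out

def perturbed_argmax(theta, epsilon=0.5, num_samples=10_000, seed=0):
    """Monte Carlo estimate of the smoothed decision and its Jacobian.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Draw standard Gaussian noise Z, one row per sample.
    noise = rng.standard_normal((num_samples, theta.shape[0]))
    # Solve the discrete problem once per perturbed input theta + epsilon * Z.
    decisions = one_hot_argmax(theta[None, :] + epsilon * noise)
    # Averaging the hard decisions gives a smooth function of theta.
    smoothed = decisions.mean(axis=0)
    # Gaussian identity: Jacobian is approximately mean(y Z^T) / epsilon.
    jacobian = decisions.T @ noise / (num_samples * epsilon)
    return smoothed, jacobian

theta = np.array([1.0, 1.2, 0.8])
smoothed, jacobian = perturbed_argmax(theta)
print(smoothed)        # a probability vector rather than a hard one-hot vector
print(jacobian.shape)  # (3, 3)
```

In this sketch a larger `epsilon` spreads the output more evenly across the entries, while a smaller `epsilon` brings it closer to the hard one-hot decision, which matches the abstract's remark that the amount of smoothness is tuned through the noise amplitude.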
90C06 ; 68W20 ; 62F99
The class of integer-valued trawl processes has recently been introduced for modelling univariate and multivariate integer-valued time series with short or long memory.
In this talk, I will discuss recent developments in model estimation, model selection and forecasting for such processes. The new methods will be illustrated in an empirical study of high-frequency financial data.
This is joint work with Mikkel Bennedsen (Aarhus University), Asger Lunde (Aarhus University) and Neil Shephard (Harvard University).
37M10 ; 60G10 ; 60G55 ; 62F99 ; 62M10 ; 62P05
Modern machine learning architectures often embed their inputs into a lower-dimensional latent space before generating a final output. A vast set of empirical results---and some emerging theory---predicts that these lower-dimensional codes often are highly structured, capturing lower-dimensional variation in the data. Based on this observation, in this talk I will describe efforts in my group to develop lightweight algorithms that navigate, restructure, and reshape learned latent spaces. Along the way, I will consider a variety of practical problems in machine learning, including low-rank adaptation of large models, regularization to promote local latent structure, and efficient training/evaluation of generative models.
62E20 ; 62F99 ; 62G07 ; 62P30 ; 65C50 ; 68T99