Documents Roquain, Etienne 30 results

I shall classify current approaches to multiple inferences according to goals, and discuss the basic approaches being used. I shall then highlight a few challenges that await our attention: some are simple inequalities, others arise in particular applications.

62J15 ; 62P10

Free probability and random matrices - Biane, Philippe (Conference Author) | CIRM H

Multi angle

I will explain how free probability, which is a theory of independence for non-commutative random variables, can be applied to understand the spectra of various models of random matrices.

15B52 ; 60B20 ; 46L53 ; 46L54

Selective inference in genetics - Sabatti, Chiara (Conference Author) | CIRM H

Multi angle

Geneticists have always been aware that, when looking for signal across the entire genome, one has to be very careful to avoid false discoveries. Contemporary studies often involve a very large number of traits, increasing the challenges of "looking everywhere". I will discuss novel approaches that allow an adaptive exploration of the data, while guaranteeing reproducible results.

62F15 ; 62J15 ; 62P10 ; 92D10

Learning on the symmetric group - Vert, Jean-Philippe (Conference Author) | CIRM H

Multi angle

Many data can be represented as rankings or permutations, raising the question of developing machine learning models on the symmetric group. When the number of items in the permutations gets large, manipulating permutations can quickly become computationally intractable. I will discuss two computationally efficient embeddings of the symmetric groups in Euclidean spaces leading to fast machine learning algorithms, and illustrate their relevance on biological applications and image classification.

62H30 ; 62P10 ; 68T05

High-dimensional classification by sparse logistic regression - Abramovich, Felix (Conference Author) | CIRM H

Virtual conference

In this talk we consider high-dimensional classification. We discuss first high-dimensional binary classification by sparse logistic regression, propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic bounds for the resulting misclassification excess risk. Implementation of any complexity penalty-based criterion, however, requires a combinatorial search over all possible models. To find a model selection procedure computationally feasible for high-dimensional data, we consider logistic Lasso and Slope classifiers and show that they also achieve the optimal rate. We extend further the proposed approach to multiclass classification by sparse multinomial logistic regression.

This is joint work with Vadim Grinshtein and Tomer Levy.

62H30 ; 62C20

We study the model selection problem in a large class of causal time series models, which includes the ARMA or AR($\infty$) processes as well as the GARCH or ARCH($\infty$), APARCH, ARMA-GARCH and many other processes. To tackle this issue, we consider a penalized contrast based on the quasi-likelihood of the model. We provide sufficient conditions for the penalty term to ensure the consistency of the proposed procedure as well as the consistency and the asymptotic normality of the quasi-maximum likelihood estimator of the chosen model. We also propose a tool for diagnosing the goodness-of-fit of the chosen model based on a portmanteau test. Monte Carlo experiments and numerical applications on illustrative examples are performed to highlight the obtained asymptotic results. Moreover, using a data-driven choice of the penalty, they show the practical efficiency of this new model selection procedure and portmanteau test.

60K35

As a generalization of the fill-in free property of a sparse positive definite real symmetric matrix with respect to the Cholesky decomposition, we introduce a notion of (quasi-)Cholesky structure for a real vector space of symmetric matrices. The cone of positive definite symmetric matrices in a vector space with a quasi-Cholesky structure admits explicit calculations and rich analysis similar to the ones for the Gaussian selection model associated to a decomposable graph. In particular, we can apply our method to a decomposable graphical model with a vertex permutation symmetry.

15B48 ; 62E15

Floodgate: inference for model-free variable importance - Janson, Lucas (Conference Author) | CIRM H

Virtual conference

Many modern applications seek to understand the relationship between an outcome variable of interest and a high-dimensional set of covariates. Often the first question asked is which covariates are important in this relationship, but the immediate next question, which in fact subsumes the first, is how important each covariate is in this relationship. In parametric regression this question is answered through confidence intervals on the parameters. But without making substantial assumptions about the relationship between the outcome and the covariates, it is unclear even how to measure variable importance, and for most sensible choices even less clear how to provide inference for it under reasonable conditions. In this paper we propose floodgate, a novel method to provide asymptotic inference for a scalar measure of variable importance which we argue has universal appeal, while assuming nothing but moment bounds about the relationship between the outcome and the covariates. We take a model-X approach and thus assume the covariate distribution is known, but extend floodgate to the setting that only a model for the covariate distribution is known and also quantify its robustness to violations of the modeling assumptions. We demonstrate floodgate's performance through extensive simulations and apply it to data from the UK Biobank to quantify the effects of genetic mutations on traits of interest.

62G15 ; 62G20

Treatment effect estimation with missing attributes - Josse, Julie (Conference Author) | CIRM H

Virtual conference

Inferring causal effects of a treatment or policy from observational data is central to many applications. However, state-of-the-art methods for causal inference suffer when covariates have missing values, which is ubiquitous in application.
Missing data greatly complicate causal analyses as they either require strong assumptions about the missing data generating mechanism or an adapted unconfoundedness hypothesis. In this talk, I will first provide a classification of existing methods according to the main underlying assumptions, which rely either on variants of the classical unconfoundedness assumption or on assumptions about the mechanism that generates the missing values. Then, I will present two recent contributions on this topic: (1) an extension of doubly robust estimators that allows handling of missing attributes, and (2) an approach to causal inference based on variational autoencoders adapted to incomplete data.
I will illustrate the topic on an observational medical database which has heterogeneous data and a multilevel structure to assess the impact of the administration of a treatment on survival.

62P10 ; 62H12 ; 62N99

We consider the problem of estimating the mean vector of the multivariate complex normal distribution with unknown covariance matrix under an invariant loss function when the sample size is smaller than the dimension of the mean vector. Following the approach of Chételat and Wells (2012, Ann. Statist., p. 3137–3160), we show that a modification of Baranchik-type estimators beats the MLE if it satisfies certain conditions. Based on this result, we propose the James-Stein-like shrinkage and its positive-part estimators.

62F10 ; 62C20 ; 62H12
