En poursuivant votre navigation sur ce site, vous acceptez l'utilisation d'un simple cookie d'identification. Aucune autre exploitation n'est faite de ce cookie. OK

Documents Roquain, Etienne 30 résultats

Filtrer
Sélectionner : Tous / Aucun
Q
Déposez votre fichier ici pour le déplacer vers cet enregistrement.
y

High-dimensional classification by sparse logistic regression - Abramovich, Felix (Auteur de la Conférence) | CIRM H

Virtualconference

In this talk we consider high-dimensional classification. We discuss first high-dimensional binary classification by sparse logistic regression, propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic bounds for the resulting misclassification excess risk. Implementation of any complexity penalty-based criterion, however, requires a combinatorial search over all possible models. To find a model selection procedure computationally feasible for high-dimensional data, we consider logistic Lasso and Slope classifiers and show that they also achieve the optimal rate. We extend further the proposed approach to multiclass classification by sparse multinomial logistic regression.

This is joint work with Vadim Grinshtein and Tomer Levy.[-]
In this talk we consider high-dimensional classification. We discuss first high-dimensional binary classification by sparse logistic regression, propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic bounds for the resulting misclassification excess risk. Implementation of any complexity penalty-based criterion, however, requires a combinatorial ...[+]

62H30 ; 62C20

Sélection Signaler une erreur
Déposez votre fichier ici pour le déplacer vers cet enregistrement.
y
We consider the problem of estimating the mean vector of the multivariate complex normaldistribution with unknown covariance matrix under an invariant loss function when the samplesize is smaller than the dimension of the mean vector. Following the approach of Chételat and Wells (2012, Ann.Statist, p. 3137–3160), we show that a modification of Baranchik-tpye estimatorsbeats the MLE if it satisfies certain conditions. Based on this result, we propose the James-Stein-like shrinkage and its positive-part estimators.[-]
We consider the problem of estimating the mean vector of the multivariate complex normaldistribution with unknown covariance matrix under an invariant loss function when the samplesize is smaller than the dimension of the mean vector. Following the approach of Chételat and Wells (2012, Ann.Statist, p. 3137–3160), we show that a modification of Baranchik-tpye estimatorsbeats the MLE if it satisfies certain conditions. Based on this result, we ...[+]

62F10 ; 62C20 ; 62H12

Sélection Signaler une erreur
Déposez votre fichier ici pour le déplacer vers cet enregistrement.
2y
I shall classify current approaches to multiple inferences according to goals, and discuss the basic approaches being used. I shall then highlight a few challenges that await our attention : some are simple inequalities, others arise in particular applications.

62J15 ; 62P10

Sélection Signaler une erreur
Déposez votre fichier ici pour le déplacer vers cet enregistrement.
y

Free probability and random matrices - Biane, Philippe (Auteur de la Conférence) | CIRM H

Multi angle

I will explain how free probability, which is a theory of independence for non-commutative random variables, can be applied to understand the spectra of various models of random matrices.

15B52 ; 60B20 ; 46L53 ; 46L54

Sélection Signaler une erreur
Déposez votre fichier ici pour le déplacer vers cet enregistrement.
y
Les processus de Hawkes forment une classe des processus ponctuels pour lesquels l'intensité s'écrit comme :

$\lambda(t)= \int_{0}^{t^-} h(t-s)dN_s +\nu$

où $N$ représente le processus de Hawkes, et $\nu > 0$. Les processus de Hawkes multivariés ont une intensité similaire sauf que des interractions entre les différentes composantes du processus de Hawkes sont autorisées. Les paramètres de ce modèle sont donc les fonctions d'interractions $h_{k,\ell}, k, \ell \le M$ et les constantes $\nu_\ell, \ell \le M$. Dans ce travail nous étudions une approche bayésienne nonparamétrique pour estimer les fonctions $h_{k,\ell}$ et les constantes $\nu_\ell$. Nous présentons un théorème général caractérisant la vitesse de concentration de la loi a posteriori dans de tels modèles. L'intérêt de cette approche est qu'elle permet la caractérisation de la convergence en norme $L_1$ et demande assez peu d'hypothèses sur la forme de la loi a priori. Une caractérisation de la convergence en norme $L_2$ est aussi considérée. Nous étudierons un exemple de lois a priori adaptées à l'étude des interractions neuronales. Travail en collaboration avec S. Donnet et V. Rivoirard.[-]
Les processus de Hawkes forment une classe des processus ponctuels pour lesquels l'intensité s'écrit comme :

$\lambda(t)= \int_{0}^{t^-} h(t-s)dN_s +\nu$

où $N$ représente le processus de Hawkes, et $\nu > 0$. Les processus de Hawkes multivariés ont une intensité similaire sauf que des interractions entre les différentes composantes du processus de Hawkes sont autorisées. Les paramètres de ce modèle sont donc les fonctions d'interractions ...[+]

62Gxx ; 62G05 ; 62F15 ; 62G20

Sélection Signaler une erreur
Déposez votre fichier ici pour le déplacer vers cet enregistrement.
y

Selective inference in genetics - Sabatti, Chiara (Auteur de la Conférence) | CIRM H

Multi angle

Geneticists have always been aware that, when looking for signal across the entire genome, one has to be very careful to avoid false discoveries. Contemporary studies often involve a very large number of traits, increasing the challenges of "looking every-where". I will discuss novel approaches that allow an adaptive exploration of the data, while guaranteeing reproducible results.

62F15 ; 62J15 ; 62P10 ; 92D10

Sélection Signaler une erreur
Déposez votre fichier ici pour le déplacer vers cet enregistrement.
y

Learning on the symmetric group - Vert, Jean-Philippe (Auteur de la Conférence) | CIRM H

Multi angle

Many data can be represented as rankings or permutations, raising the question of developing machine learning models on the symmetric group. When the number of items in the permutations gets large, manipulating permutations can quickly become computationally intractable. I will discuss two computationally efficient embeddings of the symmetric groups in Euclidean spaces leading to fast machine learning algorithms, and illustrate their relevance on biological applications and image classification.[-]
Many data can be represented as rankings or permutations, raising the question of developing machine learning models on the symmetric group. When the number of items in the permutations gets large, manipulating permutations can quickly become computationally intractable. I will discuss two computationally efficient embeddings of the symmetric groups in Euclidean spaces leading to fast machine learning algorithms, and illustrate their relevance ...[+]

62H30 ; 62P10 ; 68T05

Sélection Signaler une erreur
Déposez votre fichier ici pour le déplacer vers cet enregistrement.
y
We study the model selection problem in a large class of causal time series models, which includes both the ARMA or AR($\infty$) processes, as well as the GARCH or ARCH($\infty$), APARCH, ARMA-GARCH and many others processes. To tackle this issue, we consider a penalized contrast based on the quasi-likelihood of the model. We provide sufficient conditions for the penalty term to ensure the consistency of the proposed procedure as well as the consistency and the asymptotic normality of the quasi-maximum likelihood estimator of the chosen model. We also propose a tool for diagnosing the goodness-of-fit of the chosen model based on a Portmanteau test. Monte-Carlo experiments and numerical applications on illustrative examples are performed to highlight the obtained asymptotic results. Moreover, using a data-driven choice of the penalty, they show the practical efficiency of this new model selection procedure and Portemanteau test.[-]
We study the model selection problem in a large class of causal time series models, which includes both the ARMA or AR($\infty$) processes, as well as the GARCH or ARCH($\infty$), APARCH, ARMA-GARCH and many others processes. To tackle this issue, we consider a penalized contrast based on the quasi-likelihood of the model. We provide sufficient conditions for the penalty term to ensure the consistency of the proposed procedure as well as the ...[+]

60K35

Sélection Signaler une erreur
Déposez votre fichier ici pour le déplacer vers cet enregistrement.
y
There is an emerging consensus in the transdiciplinary literature that the ultimate goal of regression analysis is to model the conditional distribution of an outcome, given a set of explanatory variables or covariates. This new approach is called "distributional regression", and marks a clear break from the classical view of regression, which has focused on estimating a conditional mean or quantile only. Isotonic Distributional Regression (IDR) learns conditional distributions that are simultaneously optimal relative to comprehensive classes of relevant loss functions, subject to monotonicity constraints in terms of a partial order on the covariate space. This IDR solution is exactly computable and does not require approximations nor implementation choices, except for the selection of the partial order. Despite being an entirely generic technique, IDR is strongly competitive with state-of-the-art methods in a case study on probabilistic precipitation forecasts from a leading numerical weather prediction model.

Joint work with Alexander Henzi and Johanna F. Ziegel.[-]
There is an emerging consensus in the transdiciplinary literature that the ultimate goal of regression analysis is to model the conditional distribution of an outcome, given a set of explanatory variables or covariates. This new approach is called "distributional regression", and marks a clear break from the classical view of regression, which has focused on estimating a conditional mean or quantile only. Isotonic Distributional Regression (IDR) ...[+]

62J02 ; 68T09

Sélection Signaler une erreur
Déposez votre fichier ici pour le déplacer vers cet enregistrement.
y
The highly influential two-group model in testing a large number of statistical hypotheses assumes that the test statistics are drawn independently from a mixture of a high probability null distribution and a low probability alternative. Optimal control of the marginal false discovery rate (mFDR), in the sense that it provides maximal power (expected true discoveries) subject to mFDR control, is known to be achieved by thresholding the local false discovery rate (locFDR), i.e., the probability of the hypothesis being null given the set of test statistics, with a fixed threshold.
We address the challenge of controlling optimally the popular false discovery rate (FDR) or positive FDR (pFDR) rather than mFDR in the general two-group model, which also allows for dependence between the test statistics. These criteria are less conservative than the mFDR criterion, so they make more rejections in expectation.
We derive their optimal multiple testing (OMT) policies, which turn out to be thresholding the locFDR with a threshold that is a function of the entire set of statistics. We develop an efficient algorithm for finding these policies, and use it for problems with thousands of hypotheses. We illustrate these procedures on gene expression studies. [-]
The highly influential two-group model in testing a large number of statistical hypotheses assumes that the test statistics are drawn independently from a mixture of a high probability null distribution and a low probability alternative. Optimal control of the marginal false discovery rate (mFDR), in the sense that it provides maximal power (expected true discoveries) subject to mFDR control, is known to be achieved by thresholding the local ...[+]

62F03 ; 62J15 ; 62P10

Sélection Signaler une erreur