CIRM - Videos & books Library

Sélectionner : Tous / Aucun

Trier: P Q

Déposez votre fichier ici pour le déplacer vers cet enregistrement.

This paper addresses the following question: “Can regression trees do what other machine learning methods cannot?” To answer this question, we consider the problem of estimating regression functions with spatial inhomogeneities. Many real life applications involve functions that exhibit a variety of shapes including jump discontinuities or high-frequency oscillations. Unfortunately, the overwhelming majority of existing asymptotic minimaxity theory (for density or regression function estimation) is predicated on homogeneous smoothness assumptions which are inadequate for such data. Focusing on locally Holder functions, we provide locally adaptive posterior concentration rate results under the supremum loss. These results certify that trees can adapt to local smoothness by uniformly achieving the point-wise (near) minimax rate. Such results were previously unavailable for regression trees (forests). Going further, we construct locally adaptive credible bands whose width depends on local smoothness and which achieve uniform coverage under local self-similarity. Unlike many other machine learning methods, Bayesian regression trees thus provide valid uncertainty quantification. To highlight the benefits of trees, we show that Gaussian processes cannot adapt to local smoothness by showing lower bound results under a global estimation loss. Bayesian regression trees are thus uniquely suited for estimation and uncertainty quantification of spatially inhomogeneous functions.[-]

62G20 ; 62G15

Sélection Signaler une erreur

Déposez votre fichier ici pour le déplacer vers cet enregistrement.

Many modern applications seek to understand the relationship between an outcome variable of interest and a high-dimensional set of covariates. Often the first question asked is which covariates are important in this relationship, but the immediate next question, which in fact subsumes the first, is \emph{how} important each covariate is in this relationship. In parametric regression this question is answered through confidence intervals on the parameters. But without making substantial assumptions about the relationship between the outcome and the covariates, it is unclear even how to \emph{measure} variable importance, and for most sensible choices even less clear how to provide inference for it under reasonable conditions. In this paper we propose \emph{floodgate}, a novel method to provide asymptotic inference for a scalar measure of variable importance which we argue has universal appeal, while assuming nothing but moment bounds about the relationship between the outcome and the covariates. We take a model-X approach and thus assume the covariate distribution is known, but extend floodgate to the setting that only a \emph{model} for the covariate distribution is known and also quantify its robustness to violations of the modeling assumptions. We demonstrate floodgate's performance through extensive simulations and apply it to data from the UK Biobank to quantify the effects of genetic mutations on traits of interest.[-]

62G15 ; 62G20

Sélection Signaler une erreur

TUTELLES

PARTENAIRES

Destination de la recherche

Raccourcis

Documents 62G15 2 résultats

Bayesian spatial adaptation - Rockova, Veronika (Auteur de la Conférence) | CIRM H Nouveau

Floodgate: inference for model-free variable importance - Janson, Lucas (Auteur de la Conférence) | CIRM H Nouveau