CIRM - Videos & books Library

The advent of large-scale inference has spurred reexamination of conventional statistical thinking. In a series of highly original articles, Efron showed in several examples that the ensemble of null-distributed test statistics deviated grossly from the theoretical null distribution, and he persuasively illustrated the danger of assuming the theoretical null's veracity for downstream inference. Though intimidating in other contexts, the large-scale setting is to the statistician's benefit here: there is now potential to estimate, rather than assume, the null distribution.
In a model for n z-scores with at most k nonnulls, we adopt Efron's suggestion and consider estimation of the location and scale parameters of a Gaussian null distribution. Placing no assumptions on the nonnull effects, we consider rate-optimal estimation in the entire regime k < n/2, that is, precisely the regime in which the null parameters are identifiable. The minimax upper bound is obtained by considering estimators based on the empirical characteristic function and the classical kernel mode estimator. Faster rates than those in Huber's contamination model are achievable by exploiting the Gaussian character of the data. As a consequence, it is shown that consistent estimation is indeed possible in the practically relevant regime k ≍ n. In a certain regime, the minimax lower bound involves constructing two marginal distributions whose characteristic functions match on a wide interval containing zero. The construction notably differs from those in the literature by sharply capturing a second-order scaling of n/2 − k in the minimax rate.
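As a rough illustration of why the empirical characteristic function carries information about a Gaussian null, the sketch below reads a location and a scale off the identity phi(t) = exp(i*mu*t - sigma^2*t^2/2). It is not the estimator from the talk: the data here contain no nonnull effects at all, and the function names, the single frequency t = 1.0, and the simulated parameters are assumptions made purely for illustration.

```python
import cmath
import math
import random


def empirical_cf(z, t):
    """Empirical characteristic function (1/n) * sum_j exp(i * t * z_j)."""
    return sum(cmath.exp(1j * t * zj) for zj in z) / len(z)


def gaussian_params_from_ecf(z, t=1.0):
    """Plug-in location and scale read off the ECF at one frequency t.

    For N(mu, sigma^2), phi(t) = exp(i*mu*t - sigma^2 * t^2 / 2), so
    arg(phi(t)) = mu * t (when |mu * t| < pi) and
    |phi(t)| = exp(-sigma^2 * t^2 / 2).
    """
    phi = empirical_cf(z, t)
    mu_hat = cmath.phase(phi) / t
    sigma2_hat = -2.0 * math.log(abs(phi)) / t**2
    return mu_hat, math.sqrt(max(sigma2_hat, 0.0))


# Hypothetical demo data: pure null z-scores, no contamination.
rng = random.Random(0)
z = [rng.gauss(0.3, 1.2) for _ in range(20000)]
mu_hat, sigma_hat = gaussian_params_from_ecf(z, t=1.0)
print(mu_hat, sigma_hat)
```

With up to k arbitrary nonnull effects mixed in, a single-frequency plug-in like this is no longer reliable, which is the difficulty the abstract addresses.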

In this short course, we will discuss the problem of ranking from partially observed pairwise comparisons under the Bradley-Terry-Luce (BTL) model. There are two fundamental problems: 1) top-K ranking, which is to select the set of K players with the best performance; 2) total ranking, which is to rank the entire set of players. Both ranking problems have important applications in web search, recommender systems and sports competitions.
In the first presentation, we will consider the top-K ranking problem. The statistical properties of two popular algorithms, the maximum likelihood estimator (MLE) and rank centrality (spectral ranking), will be precisely characterized. In terms of both partial and exact recovery, the MLE achieves optimality with matching lower bounds. The spectral method is shown to be generally sub-optimal, though it has the same order of sample complexity as the MLE. Our theory also reveals essentially the only situation in which the spectral method is optimal. This turns out to be the most favorable choice of skill parameters given the separation of the two groups.
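For readers unfamiliar with the spectral method, here is a minimal sketch of rank centrality on simulated BTL data, where player i beats player j with probability w_i / (w_i + w_j). The skill weights, the number of games per pair, the normalizer d, and the function names are all assumptions chosen for the demo, and every pair is compared here, whereas the course considers partially observed comparisons.

```python
import random


def simulate_btl(weights, games_per_pair, rng):
    """Return wins[i][j] = number of games in which i beat j under BTL."""
    n = len(weights)
    wins = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p_i = weights[i] / (weights[i] + weights[j])
            for _ in range(games_per_pair):
                if rng.random() < p_i:
                    wins[i][j] += 1
                else:
                    wins[j][i] += 1
    return wins


def rank_centrality(wins, iters=2000):
    """Stationary distribution of the comparison Markov chain."""
    n = len(wins)
    d = n  # normalizer; any d above the max number of opponents works
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = wins[i][j] + wins[j][i]
            if i != j and total > 0:
                # Move from i to j in proportion to how often j beat i.
                P[i][j] = wins[j][i] / total / d
        P[i][i] = 1.0 - sum(P[i])  # remaining mass stays at i
    pi = [1.0 / n] * n
    for _ in range(iters):  # power iteration: pi <- pi P
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical skills: players 0, 1, 2 form the top group.
rng = random.Random(1)
weights = [5.0, 4.0, 3.0, 1.0, 0.8, 0.6]
scores = rank_centrality(simulate_btl(weights, 200, rng))
top_k = set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:3])
print(top_k)
```

The top-K set is read off by sorting the stationary probabilities; the MLE would instead maximize the BTL likelihood over the skill parameters.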
The second presentation focuses on total ranking. The problem is to find a permutation vector that ranks the entire set of players. We will show that the minimax rate of the problem with respect to Kendall's tau loss exhibits a transition between an exponential rate and a polynomial rate, depending on the signal-to-noise ratio of the problem. The optimal algorithm consists of two stages. In the first stage, games with very high or very low scores are used to partition the entire set of players into different leagues. In the second stage, games that are very close are used to rank the players within each league. We will give intuition and some analysis to show why the algorithm works optimally.
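To make the loss concrete, the sketch below counts the pairs of players that two rankings order differently, which is the quantity underlying Kendall's tau. The normalization used in the lectures is not stated in the abstract, so the raw count is returned here; the function name and the example rankings are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(rank_a, rank_b):
    """Number of pairs (i, j) that the two rankings order differently."""
    return sum(
        1
        for i, j in combinations(range(len(rank_a)), 2)
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0
    )


# rank[i] is the position assigned to player i (1 = best).
true_rank = [1, 2, 3, 4, 5]
est_rank = [2, 1, 3, 5, 4]  # two adjacent swaps
distance = kendall_tau_distance(true_rank, est_rank)
print(distance)  # 2
```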

62C20 ; 62F07 ; 62J12

Documents: Gao, Chao (3 results)

Minimax estimation in Efron's two-groups model - Gao, Chao (Conference speaker) | CIRM

Statistical optimality and algorithms for top-K and total ranking - Lecture 1 - Gao, Chao (Conference speaker) | CIRM

Statistical optimality and algorithms for top-K and total ranking - Lecture 2 - Gao, Chao (Conference speaker) | CIRM