Review on ranking and selection: A new perspective
In this paper, we briefly review the development of ranking and selection (R&S) in the past
70 years, especially the theoretical achievements and practical applications in the past 20 …
Hyperband: A novel bandit-based approach to hyperparameter optimization
Performance of machine learning algorithms depends critically on identifying a good set of
hyperparameters. While recent approaches use Bayesian optimization to adaptively select …
Best-arm identification algorithms for multi-armed bandits in the fixed confidence setting
This paper is concerned with identifying the arm with the highest mean in a multi-armed
bandit problem using as few independent samples from the arms as possible. While the so …
Non-stochastic best arm identification and hyperparameter optimization
Motivated by the task of hyperparameter optimization, we introduce the non-stochastic
best-arm identification problem. We identify an attractive algorithm for this setting that makes …
On the complexity of best-arm identification in multi-armed bandit models
The stochastic multi-armed bandit model is a simple abstraction that has proven useful in
many different contexts in statistics and machine learning. Whereas the achievable limit in …
Game-theoretic statistics and safe anytime-valid inference
Safe anytime-valid inference (SAVI) provides measures of statistical evidence and certainty—
e-processes for testing and confidence sequences for estimation—that remain valid at all …
Time-uniform, nonparametric, nonasymptotic confidence sequences
The Annals of
Statistics 2021, Vol. 49, No. 2, 1055–1080 https://doi.org/10.1214/20-AOS1991 © Institute of …
Optimal best arm identification with fixed confidence
We give a complete characterization of the complexity of best-arm identification in one-
parameter bandit problems. We prove a new, tight lower bound on the sample complexity …
Top two algorithms revisited
Top two algorithms arose as an adaptation of Thompson sampling to best arm identification
in multi-armed bandit models for parametric families of arms. They select the next arm to …
Anytime-valid off-policy inference for contextual bandits
Contextual bandit algorithms are ubiquitous tools for active sequential experimentation in
healthcare and the tech industry. They involve online learning algorithms that adaptively …