Review on ranking and selection: A new perspective
In this paper, we briefly review the development of ranking and selection (R&S) in the past
70 years, especially the theoretical achievements and practical applications in the past 20 …
Best-arm identification algorithms for multi-armed bandits in the fixed confidence setting
This paper is concerned with identifying the arm with the highest mean in a multi-armed
bandit problem using as few independent samples from the arms as possible. While the so …
Text-to-image diffusion models are zero shot classifiers
The excellent generative capabilities of text-to-image diffusion models suggest they learn
informative representations of image-text data. However, what knowledge their …
[BOOK][B] Bandit algorithms
T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
Introduction to multi-armed bandits
A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …
Regret analysis of stochastic and nonstochastic multi-armed bandit problems
Multi-armed bandit problems are the most basic examples of sequential decision problems
with an exploration-exploitation trade-off. This is the balance between staying with the option …
[BOOK][B] Algorithms for reinforcement learning
C Szepesvári - 2022 - books.google.com
Reinforcement learning is a learning paradigm concerned with learning to control a system
so as to maximize a numerical performance measure that expresses a long-term objective …
Almost optimal exploration in multi-armed bandits
We study the problem of exploration in stochastic Multi-Armed Bandits. Even in the simplest
setting of identifying the best arm, there remains a logarithmic multiplicative gap between the …
lil'ucb: An optimal exploration algorithm for multi-armed bandits
The paper proposes a novel upper confidence bound (UCB) procedure for identifying the
arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using …
Simple Bayesian algorithms for best arm identification
D Russo - Conference on Learning Theory, 2016 - proceedings.mlr.press
This paper considers the optimal adaptive allocation of measurement effort for identifying the
best among a finite set of options or designs. An experimenter sequentially chooses designs …