The statistical complexity of interactive decision making
A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …
[BOOK][B] Bandit algorithms
T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
Introduction to multi-armed bandits
A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …
Optimal best-arm identification in linear bandits
We study the problem of best-arm identification with fixed confidence in stochastic linear
bandits. The objective is to identify the best arm with a given level of certainty while …
Mixture martingales revisited with applications to sequential tests and confidence intervals
This paper presents new deviation inequalities that are valid uniformly in time under
adaptive sampling in a multi-armed bandit model. The deviations are measured using the …
High-dimensional sparse linear bandits
Stochastic linear bandits with high-dimensional sparse features are a practical model for a
variety of domains, such as personalized medicine and online advertising. We derive a …
Fast pure exploration via Frank-Wolfe
We study the problem of active pure exploration with fixed confidence in generic stochastic
bandit environments. The goal of the learner is to answer a query about the environment …
Achieving near instance-optimality and minimax-optimality in stochastic and adversarial linear bandits simultaneously
In this work, we develop linear bandit algorithms that automatically adapt to different
environments. By plugging a novel loss estimator into the optimization problem that …
Approximate allocation matching for structural causal bandits with unobserved confounders
The structural causal bandit provides a framework for online decision-making problems when
causal information is available. It models the stochastic environment with a structural causal …
Best arm identification with fixed budget: A large deviation perspective
We consider the problem of identifying the best arm in stochastic Multi-Armed Bandits
(MABs) using a fixed sampling budget. Characterizing the minimal instance-specific error …