- Academic Search

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 3287 (9 versions)

[PDF] nowpublishers.com

Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com

Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Cited by 1253 (7 versions)

[PDF] nature.com

Efficient and targeted COVID-19 border testing via reinforcement learning

H Bastani, K Drakopoulos, V Gupta, I Vlachogiannis… - Nature, 2021 - nature.com

Throughout the coronavirus disease 2019 (COVID-19) pandemic, countries have relied on a
variety of ad hoc border control protocols to allow for non-essential travel while safeguarding …

Cited by 127 (11 versions)

[PDF] arxiv.org

Exploration-exploitation in constrained MDPs

Y Efroni, S Mannor, M Pirotta - arXiv preprint arXiv:2003.02189, 2020 - arxiv.org

In many sequential decision-making problems, the goal is to optimize a utility function while
satisfying a set of constraints on different utilities. This learning problem is formalized …

Cited by 178 (2 versions)

[PDF] acm.org

Bandits with knapsacks

A Badanidiyuru, R Kleinberg, A Slivkins - Journal of the ACM (JACM), 2018 - dl.acm.org

Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation
tradeoffs in learning, and they have countless applications ranging from medical …

Cited by 536 (11 versions)

[PDF] ssrn.com

Feature-based dynamic pricing

MC Cohen, I Lobel, R Paes Leme - Management Science, 2020 - pubsonline.informs.org

We consider the problem faced by a firm that receives highly differentiated products in an
online fashion. The firm needs to price these products to sell them to its customer base …

Cited by 246 (17 versions)

[PDF] neurips.cc

A unifying framework for online optimization with long-term constraints

M Castiglioni, A Celli, A Marchesi… - Advances in Neural …, 2022 - proceedings.neurips.cc

We study online learning problems in which a decision maker has to take a sequence of
decisions subject to $m$ long-term constraints. The goal of the decision maker is to …

Cited by 32 (8 versions)

[PDF] arxiv.org

Meta dynamic pricing: Transfer learning across experiments

H Bastani, D Simchi-Levi, R Zhu - Management Science, 2022 - pubsonline.informs.org

We study the problem of learning shared structure across a sequence of dynamic pricing
experiments for related products. We consider a practical formulation in which the unknown …

Cited by 140 (9 versions)

[PDF] arxiv.org

Adversarial bandits with knapsacks

N Immorlica, K Sankararaman, R Schapire… - Journal of the ACM, 2022 - dl.acm.org

We consider Bandits with Knapsacks (henceforth, BwK), a general model for multi-armed
bandits under supply/budget constraints. In particular, a bandit algorithm needs to solve a …

Cited by 134 (12 versions)

[PDF] siam.org

Fast algorithms for online stochastic convex programming

S Agrawal, NR Devanur - Proceedings of the twenty-sixth annual ACM-SIAM …, 2014 - SIAM

We introduce the online stochastic Convex Programming (CP) problem, a very general
version of stochastic online problems which allows arbitrary concave objectives and convex …

Cited by 205 (8 versions)
