Google Scholar

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 3297. All 9 versions.

[PDF] arxiv.org

A survey of online experiment design with the stochastic multi-armed bandit

G Burtini, J Loeppky, R Lawrence - arXiv preprint arXiv:1510.00757, 2015 - arxiv.org

Adaptive and sequential experiment design is a well-studied area in numerous domains. We
survey and synthesize the work of the online statistical learning paradigm referred to as multi …

Cited by 165. All 3 versions.

[PDF] sagepub.com

Recommendation system for adaptive learning

Y Chen, X Li, J Liu, Z Ying - Applied psychological …, 2018 - journals.sagepub.com

An adaptive learning system aims at providing instruction tailored to the current status of a
learner, differing from the traditional classroom experience. The latest advances in …

Cited by 134. All 11 versions.

[PDF] springer.com

Reinforcement learning for sequential decision making in population research

N Deliu - Quality & Quantity, 2024 - Springer

Reinforcement learning (RL) algorithms have been long recognized as powerful tools for
optimal sequential decision making. The framework is concerned with a decision maker, the …

Cited by 11. All 4 versions.

[PDF] acm.org

The Gittins policy is nearly optimal in the M/G/k under extremely general conditions

Z Scully, I Grosof, M Harchol-Balter - … of the ACM on Measurement and …, 2020 - dl.acm.org

The Gittins scheduling policy minimizes the mean response time in the single-server M/G/1
queue in a wide variety of settings. Most famously, Gittins is optimal when preemption is …

Cited by 37. All 5 versions.

[BOOK][B] Multi-armed bandits: Theory and applications to online learning in networks

Q Zhao - 2019 - books.google.com

Multi-armed bandit problems pertain to optimal sequential decision making and learning in
unknown environments. Since the first bandit problem posed by Thompson in 1933 for the …

Cited by 48. All 4 versions.

[PDF] arxiv.org

The assistive multi-armed bandit

L Chan, D Hadfield-Menell, S Srinivasa… - 2019 14th ACM/IEEE …, 2019 - ieeexplore.ieee.org

Learning preferences implicit in the choices humans make is a well studied problem in both
economics and computer science. However, most work makes the assumption that humans …

Cited by 55. All 6 versions.

[PDF] cmu.edu

A new toolbox for scheduling theory

Z Scully - ACM SIGMETRICS Performance Evaluation Review, 2023 - dl.acm.org

Queueing delays are ubiquitous in many domains, including computer systems, service
systems, communication networks, supply chains, and transportation. Queueing and …

Cited by 14. All 11 versions.

[PDF] mcgill.ca

Conditions for indexability of restless bandits and an algorithm to compute Whittle index

N Akbarzadeh, A Mahajan - Advances in Applied Probability, 2022 - cambridge.org

Restless bandits are a class of sequential resource allocation problems concerned with
allocating one or more resources among several alternative processes where the evolution …

Cited by 27. All 8 versions.

[PDF] neurips.cc

Multi-armed bandits with bounded arm-memory: Near-optimal guarantees for best-arm identification and regret minimization

A Maiti, V Patil, A Khan - Advances in Neural Information …, 2021 - proceedings.neurips.cc

We study the Stochastic Multi-armed Bandit problem under bounded arm-memory.
In this setting, the arms arrive in a stream, and the number of arms that can be stored in the …

Cited by 17. All 7 versions.
