[BOOK][B] Bandit algorithms
T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
A survey of online experiment design with the stochastic multi-armed bandit
Adaptive and sequential experiment design is a well-studied area in numerous domains. We
survey and synthesize the work of the online statistical learning paradigm referred to as multi …
Recommendation system for adaptive learning
An adaptive learning system aims at providing instruction tailored to the current status of a
learner, differing from the traditional classroom experience. The latest advances in …
Reinforcement learning for sequential decision making in population research
N Deliu - Quality & Quantity, 2024 - Springer
Reinforcement learning (RL) algorithms have been long recognized as powerful tools for
optimal sequential decision making. The framework is concerned with a decision maker, the …
The Gittins policy is nearly optimal in the M/G/k under extremely general conditions
The Gittins scheduling policy minimizes the mean response time in the single-server M/G/1
queue in a wide variety of settings. Most famously, Gittins is optimal when preemption is …
[BOOK][B] Multi-armed bandits: Theory and applications to online learning in networks
Q Zhao - 2019 - books.google.com
Multi-armed bandit problems pertain to optimal sequential decision making and learning in
unknown environments. Since the first bandit problem posed by Thompson in 1933 for the …
The assistive multi-armed bandit
Learning preferences implicit in the choices humans make is a well studied problem in both
economics and computer science. However, most work makes the assumption that humans …
A new toolbox for scheduling theory
Z Scully - ACM SIGMETRICS Performance Evaluation Review, 2023 - dl.acm.org
Queueing delays are ubiquitous in many domains, including computer systems, service
systems, communication networks, supply chains, and transportation. Queueing and …
Conditions for indexability of restless bandits and an algorithm to compute Whittle index
Restless bandits are a class of sequential resource allocation problems concerned with
allocating one or more resources among several alternative processes where the evolution …
Multi-armed bandits with bounded arm-memory: Near-optimal guarantees for best-arm identification and regret minimization
We study the stochastic multi-armed bandit problem under bounded arm-memory.
In this setting, the arms arrive in a stream, and the number of arms that can be stored in the …