Introduction to multi-armed bandits
A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …
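As a concrete illustration of the framework described in this entry (not an algorithm taken from the monograph itself), below is a minimal sketch of the classical UCB1 rule for a stochastic bandit: play each arm once, then repeatedly pick the arm with the largest empirical mean plus exploration bonus. The `pull` callback and the Bernoulli arm means in the usage lines are hypothetical.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Minimal UCB1 loop; pull(arm) is assumed to return a reward in [0, 1]."""
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # initialization: play each arm once
        else:
            # pick the arm maximizing empirical mean + sqrt(2 log t / n_a)
            arm = max(range(n_arms),
                      key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))
        r = pull(arm)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # running average update
    return means

# Hypothetical usage: three Bernoulli arms with unknown success probabilities.
probs = [0.2, 0.5, 0.7]
estimates = ucb1(lambda a: float(random.random() < probs[a]), len(probs), 10_000)
```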
Federated multi-armed bandits
Federated multi-armed bandits (FMAB) is a new bandit paradigm that parallels the federated
learning (FL) framework in supervised learning. It is inspired by practical applications in …
Bandit learning in decentralized matching markets
We study two-sided matching markets in which one side of the market (the players) does not
have a priori knowledge about its preferences for the other side (the arms) and is required to …
SIC-MMAB: Synchronisation involves communication in multiplayer multi-armed bandits
Motivated by cognitive radio networks, we consider the stochastic multiplayer multi-armed
bandit problem, where several players pull arms simultaneously and collisions occur if one …
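For readers unfamiliar with the multiplayer setting mentioned in this and later entries, the sketch below shows the standard collision-reward convention (an assumed model, not code from the cited paper): when two or more players pull the same arm in a round, the colliding players receive reward 0, while a player alone on an arm draws a Bernoulli reward with that arm's mean. The arm means and player choices in the usage line are hypothetical.

```python
import random
from collections import Counter

def multiplayer_round(choices, probs):
    """One round under the standard collision model: colliding players get 0,
    a lone player on an arm gets a Bernoulli reward with that arm's mean."""
    counts = Counter(choices)
    rewards = []
    for arm in choices:
        if counts[arm] > 1:
            rewards.append(0.0)  # collision: the reward is lost
        else:
            rewards.append(float(random.random() < probs[arm]))
    return rewards

# Hypothetical usage: 3 players choosing among 4 arms (players 2 and 3 collide on arm 2).
print(multiplayer_round(choices=[0, 2, 2], probs=[0.9, 0.7, 0.5, 0.3]))
```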
Cooperative stochastic bandits with asynchronous agents and constrained feedback
This paper studies a cooperative multi-armed bandit problem with $M$ agents cooperating
together to solve the same instance of a $K$-armed stochastic bandit problem with the goal …
Heterogeneous multi-player multi-armed bandits: Closing the gap and generalization
Despite the significant interest and many recent advances in decentralized multi-player multi-
armed bandit (MP-MAB) problems, the regret gap to the natural centralized …
Regret, stability & fairness in matching markets with bandit learners
Making an informed decision—for example, when choosing a career or housing—requires
knowledge about the available options. Such knowledge is generally acquired through …
Cooperative multi-agent bandits with heavy tails
A Dubey - International conference on machine learning, 2020 - proceedings.mlr.press
We study the heavy-tailed stochastic bandit problem in the cooperative multi-agent setting,
where a group of agents interact with a common bandit problem, while communicating on a …
Multiplayer bandits without observing collision information
We study multiplayer stochastic multiarmed bandit problems in which the players cannot
communicate, and if two or more players pull the same arm, a collision occurs and the …
[BOOK] Multi-armed bandits: Theory and applications to online learning in networks
Q Zhao - 2019 - books.google.com
Multi-armed bandit problems pertain to optimal sequential decision making and learning in
unknown environments. Since the first bandit problem posed by Thompson in 1933 for the …