Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Explore, exploit, and explain: personalizing explainable recommendations with bandits

J McInerney, B Lacker, S Hansen, K Higley… - Proceedings of the 12th …, 2018 - dl.acm.org
The multi-armed bandit is an important framework for balancing exploration with exploitation
in recommendation. Exploitation recommends content (eg, products, movies, music playlists) …

Combinatorial slee** bandits with fairness constraints

F Li, J Liu, B Ji - IEEE Transactions on Network Science and …, 2019 - ieeexplore.ieee.org
The multi-armed bandit (MAB) model has been widely adopted for studying many practical
optimization problems (network resource allocation, ad placement, crowdsourcing, etc.) with …

Tight regret bounds for stochastic combinatorial semi-bandits

B Kveton, Z Wen, A Ashkan… - Artificial Intelligence …, 2015 - proceedings.mlr.press
A stochastic combinatorial semi-bandit is an online learning problem where at each step a
learning agent chooses a subset of ground items subject to constraints, and then observes …

Cascading bandits: Learning to rank in the cascade model

B Kveton, C Szepesvari, Z Wen… - … conference on machine …, 2015 - proceedings.mlr.press
A search engine usually outputs a list of K web pages. The user examines this list, from the
first web page to the last, and chooses the first attractive page. This model of user behavior …

Combinatorial multi-armed bandit and its extension to probabilistically triggered arms

W Chen, Y Wang, Y Yuan, Q Wang - Journal of Machine Learning …, 2016 - jmlr.org
In the past few years, differential privacy has become a standard concept in the area of
privacy. One of the most important problems in this field is to answer queries while …

Combinatorial pure exploration of multi-armed bandits

S Chen, T Lin, I King, MR Lyu… - Advances in neural …, 2014 - proceedings.neurips.cc
We study the {\em combinatorial pure exploration (CPE)} problem in the stochastic multi-
armed bandit setting, where a learner explores a set of arms with the objective of identifying …

Thompson sampling for combinatorial semi-bandits

S Wang, W Chen - International Conference on Machine …, 2018 - proceedings.mlr.press
We study the application of the Thompson sampling (TS) methodology to the stochastic
combinatorial multi-armed bandit (CMAB) framework. We analyze the standard TS algorithm …

Combinatorial bandits revisited

R Combes… - Advances in neural …, 2015 - proceedings.neurips.cc
This paper investigates stochastic and adversarial combinatorial multi-armed bandit
problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific …

Combinatorial multi-armed bandit with general reward functions

W Chen, W Hu, F Li, J Li, Y Liu… - Advances in Neural …, 2016 - proceedings.neurips.cc
In this paper, we study the stochastic combinatorial multi-armed bandit (CMAB) framework
that allows a general nonlinear reward function, whose expected value may not depend only …