Online learning: A comprehensive survey

SCH Hoi, D Sahoo, J Lu, P Zhao - Neurocomputing, 2021 - Elsevier
Online learning represents a family of machine learning methods, where a learner attempts
to tackle some predictive (or any type of decision-making) task by learning from a sequence …

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Ranger21: a synergistic deep learning optimizer

L Wright, N Demeure - arXiv preprint arXiv:2106.13731, 2021 - arxiv.org
As optimizers are critical to the performances of neural networks, every year a large number
of papers innovating on the subject are published. However, while most of these …

Regret analysis of stochastic and nonstochastic multi-armed bandit problems

S Bubeck, N Cesa-Bianchi - Foundations and Trends® in …, 2012 - nowpublishers.com
Multi-armed bandit problems are the most basic examples of sequential decision problems
with an exploration-exploitation trade-off. This is the balance between staying with the option …

Tight regret bounds for stochastic combinatorial semi-bandits

B Kveton, Z Wen, A Ashkan… - Artificial Intelligence …, 2015 - proceedings.mlr.press
A stochastic combinatorial semi-bandit is an online learning problem where at each step a
learning agent chooses a subset of ground items subject to constraints, and then observes …

Off-policy evaluation for slate recommendation

A Swaminathan, A Krishnamurthy… - Advances in …, 2017 - proceedings.neurips.cc
This paper studies the evaluation of policies that recommend an ordered set of items (e.g., a
ranking) based on some context, a common scenario in web search, ads, and …

Combinatorial multi-armed bandit and its extension to probabilistically triggered arms

W Chen, Y Wang, Y Yuan, Q Wang - Journal of Machine Learning …, 2016 - jmlr.org
In the past few years, differential privacy has become a standard concept in the area of
privacy. One of the most important problems in this field is to answer queries while …

Combinatorial pure exploration of multi-armed bandits

S Chen, T Lin, I King, MR Lyu… - Advances in neural …, 2014 - proceedings.neurips.cc
We study the combinatorial pure exploration (CPE) problem in the stochastic multi-armed
bandit setting, where a learner explores a set of arms with the objective of identifying …

Combinatorial bandits revisited

R Combes… - Advances in neural …, 2015 - proceedings.neurips.cc
This paper investigates stochastic and adversarial combinatorial multi-armed bandit
problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific …