Online learning: A comprehensive survey

SCH Hoi, D Sahoo, J Lu, P Zhao - Neurocomputing, 2021 - Elsevier
Online learning represents a family of machine learning methods, where a learner attempts
to tackle some predictive (or any type of decision-making) task by learning from a sequence …

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cascading bandits: Learning to rank in the cascade model

B Kveton, C Szepesvari, Z Wen… - … conference on machine …, 2015 - proceedings.mlr.press
A search engine usually outputs a list of K web pages. The user examines this list, from the
first web page to the last, and chooses the first attractive page. This model of user behavior …

Robust influence maximization

W Chen, T Lin, Z Tan, M Zhao, X Zhou - Proceedings of the 22nd ACM …, 2016 - dl.acm.org
In this paper, we address the important issue of uncertainty in the edge influence probability
estimates for the well-studied influence maximization problem, the task of finding k seed …

Thompson sampling for combinatorial semi-bandits

S Wang, W Chen - International Conference on Machine …, 2018 - proceedings.mlr.press
We study the application of the Thompson sampling (TS) methodology to the stochastic
combinatorial multi-armed bandit (CMAB) framework. We analyze the standard TS algorithm …

Hierarchical bayesian bandits

J Hong, B Kveton, M Zaheer… - International …, 2022 - proceedings.mlr.press
Meta-, multi-task, and federated learning can all be viewed as solving similar tasks,
drawn from a distribution that reflects task similarities. We provide a unified view of all these …

Combinatorial multi-armed bandit with general reward functions

W Chen, W Hu, F Li, J Li, Y Liu… - Advances in Neural …, 2016 - proceedings.neurips.cc
In this paper, we study the stochastic combinatorial multi-armed bandit (CMAB) framework
that allows a general nonlinear reward function, whose expected value may not depend only …

Contextual combinatorial cascading bandits

S Li, B Wang, S Zhang, W Chen - … conference on machine …, 2016 - proceedings.mlr.press
We propose the contextual combinatorial cascading bandits, a combinatorial online learning
game, where at each time step a learning agent is given a set of contextual information, then …

Online influence maximization under independent cascade model with semi-bandit feedback

Z Wen, B Kveton, M Valko… - Advances in neural …, 2017 - proceedings.neurips.cc
We study the online influence maximization problem in social networks under the
independent cascade model. Specifically, we aim to learn the set of "best influencers" in a …

Multi-round influence maximization

L Sun, W Huang, PS Yu, W Chen - Proceedings of the 24th ACM …, 2018 - dl.acm.org
In this paper, we study the Multi-Round Influence Maximization (MRIM) problem, where
influence propagates in multiple rounds independently from possibly different seed sets, and …