Online learning: A comprehensive survey
Online learning represents a family of machine learning methods, where a learner attempts
to tackle some predictive (or any type of decision-making) task by learning from a sequence …
[BOOK][B] Bandit algorithms
T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
Cascading bandits: Learning to rank in the cascade model
A search engine usually outputs a list of K web pages. The user examines this list, from the
first web page to the last, and chooses the first attractive page. This model of user behavior …
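The user behavior described in this entry can be simulated directly. Below is a minimal sketch of the cascade click model, assuming per-position attraction probabilities as input (the function name and interface are illustrative, not from the paper):

```python
import random

def cascade_click(attraction_probs):
    """Simulate the cascade model: the user scans a ranked list of K
    web pages from top to bottom and clicks the first attractive one.

    attraction_probs: attraction probability of the page at each rank.
    Returns the index of the clicked page, or None if no page attracts.
    """
    for k, p in enumerate(attraction_probs):
        if random.random() < p:
            return k  # first attractive page is chosen; scan stops
    return None  # user examined the whole list without clicking
```

A bandit learner in this setting observes only which position (if any) was clicked, and must infer the attraction probabilities from that partial feedback.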
Robust influence maximization
In this paper, we address the important issue of uncertainty in the edge influence probability
estimates for the well-studied influence maximization problem---the task of finding k seed …
Thompson sampling for combinatorial semi-bandits
S Wang, W Chen - International Conference on Machine …, 2018 - proceedings.mlr.press
We study the application of the Thompson sampling (TS) methodology to the stochastic
combinatorial multi-armed bandit (CMAB) framework. We analyze the standard TS algorithm …
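To make the TS-for-CMAB setting concrete, here is a minimal sketch of one round of Thompson sampling for a simplified top-m semi-bandit instance with Bernoulli base arms; the Beta-posterior bookkeeping and the top-m oracle are assumptions for illustration, not the paper's exact algorithm:

```python
import random

def thompson_semi_bandit_round(successes, failures, m):
    """One round of Thompson sampling for a top-m combinatorial
    semi-bandit: sample a mean for each base arm from its Beta(s+1, f+1)
    posterior, then play the m arms with the largest sampled means.

    successes, failures: per-arm observed Bernoulli outcome counts.
    Returns the indices of the m arms selected this round.
    """
    samples = [random.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    # The "offline oracle" here is simply top-m selection.
    return sorted(range(len(samples)), key=lambda i: -samples[i])[:m]
```

Under semi-bandit feedback, the outcome of every played base arm is observed, so each selected arm's success/failure counts are updated after the round.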
Hierarchical Bayesian bandits
Meta-, multi-task, and federated learning can all be viewed as solving similar tasks,
drawn from a distribution that reflects task similarities. We provide a unified view of all these …
Combinatorial multi-armed bandit with general reward functions
In this paper, we study the stochastic combinatorial multi-armed bandit (CMAB) framework
that allows a general nonlinear reward function, whose expected value may not depend only …
Contextual combinatorial cascading bandits
We propose the contextual combinatorial cascading bandits, a combinatorial online learning
game, where at each time step a learning agent is given a set of contextual information, then …
Online influence maximization under independent cascade model with semi-bandit feedback
We study the online influence maximization problem in social networks under the
independent cascade model. Specifically, we aim to learn the set of "best influencers" in a …
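The independent cascade model underlying this entry admits a short Monte Carlo sketch: each newly activated node gets one chance to activate each of its out-neighbors. The adjacency format and uniform edge probability below are illustrative assumptions:

```python
import random

def simulate_ic(graph, seeds, p):
    """One Monte Carlo run of the independent cascade (IC) model.

    graph: dict mapping each node to a list of its out-neighbors
           (an assumed adjacency format, not from the paper).
    seeds: initially activated seed set.
    p:     activation probability on every edge (uniform, for brevity).
    Returns the set of nodes activated by the end of the cascade.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, []):
                # Each edge fires at most once, when u first activates.
                if v not in active and random.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active
```

Averaging the size of `active` over many runs estimates the influence spread of a seed set; in the online semi-bandit setting, the learner additionally observes which edges fired during each cascade.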
Multi-round influence maximization
In this paper, we study the Multi-Round Influence Maximization (MRIM) problem, where
influence propagates in multiple rounds independently from possibly different seed sets, and …