A survey of online experiment design with the stochastic multi-armed bandit

G Burtini, J Loeppky, R Lawrence - arXiv preprint arXiv:1510.00757, 2015 - arxiv.org
Adaptive and sequential experiment design is a well-studied area in numerous domains. We
survey and synthesize the work of the online statistical learning paradigm referred to as multi …

Regret analysis of bandit problems with causal background knowledge

Y Lu, A Meisami, A Tewari… - Conference on uncertainty …, 2020 - proceedings.mlr.press
We study how to learn optimal interventions sequentially given causal information
represented as a causal graph along with associated conditional distributions. Causal …

Multi-armed bandit problem with known trend

D Bouneffouf, R Féraud - Neurocomputing, 2016 - Elsevier
We consider a variant of the multi-armed bandit model, which we call multi-armed bandit
problem with known trend, where the gambler knows the shape of the reward function of …

Ensemble recommendations via Thompson sampling: an experimental study within e-commerce

B Brodén, M Hammar, BJ Nilsson… - Proceedings of the 23rd …, 2018 - dl.acm.org
This work presents an extension of Thompson Sampling bandit policy for orchestrating the
collection of base recommendation algorithms for e-commerce. We focus on the problem of …

A definition of non-stationary bandits

Y Liu, X Kuang, B Van Roy - arXiv preprint arXiv:2302.12202, 2023 - arxiv.org
Despite the subject of non-stationary bandit learning having attracted much recent attention,
we have yet to identify a formal definition of non-stationarity that can consistently distinguish …

A systematic literature review of solutions for cold start problem

N Singh, SK Singh - … Journal of System Assurance Engineering and …, 2024 - Springer
Insufficient knowledge about a new bug or a new developer, in the context of
recommendations done in software bug repositories (SBR) mining, impacts the …

Feature-based and adaptive rule adaptation in dynamic environments

A Tabebordbar, A Beheshti, B Benatallah… - Data Science and …, 2020 - Springer
Rule-based systems have been used increasingly to augment learning algorithms for
annotating data. Rules alleviate many of the shortcomings inherent in pure algorithmic …

Multiarmed bandits for sleep recognition of elderly living in single-resident smart homes

ZK Shahid, S Saguna, C Åhlund - IEEE Internet of Things …, 2023 - ieeexplore.ieee.org
Sleep is an essential activity that affects an individual's health and ability to perform activities
of daily living (ADL). Inadequate sleep reduces cognitive capacity and leads to health …

Non-stationary contextual bandit learning via neural predictive ensemble sampling

Z Zhu, Y Liu, X Kuang, B Van Roy - arXiv preprint arXiv:2310.07786, 2023 - arxiv.org
Real-world applications of contextual bandits often exhibit non-stationarity due to
seasonality, serendipity, and evolving social trends. While a number of non-stationary …

Algorithmic and ethical aspects of recommender systems in e-commerce

D Paraschakis - 2018 - diva-portal.org
Recommender systems have become an integral part of virtually every e-commerce
application on the web. The deployment of these expert systems has enabled users to …