Non stationary multi-armed bandit: Empirical evaluation of a new concept drift-aware algorithm

E Cavenaghi, G Sottocornola, F Stella, M Zanker - Entropy, 2021 - mdpi.com
The Multi-Armed Bandit (MAB) problem has been extensively studied in order to address
real-world challenges related to sequential decision making. In this setting, an agent selects …

A definition of non-stationary bandits

Y Liu, X Kuang, B Van Roy - arXiv preprint arXiv:2302.12202, 2023 - arxiv.org
Despite the subject of non-stationary bandit learning having attracted much recent attention,
we have yet to identify a formal definition of non-stationarity that can consistently distinguish …

Learning and optimization with seasonal patterns

N Chen, C Wang, L Wang - Operations Research, 2023 - pubsonline.informs.org
A standard assumption adopted in the multiarmed bandit (MAB) framework is that the mean
rewards are constant over time. This assumption can be restrictive in the business world as …

Linear bandits with memory: from rotting to rising

G Clerici, P Laforgue, N Cesa-Bianchi - arXiv preprint arXiv:2302.08345, 2023 - arxiv.org
Nonstationary phenomena, such as satiation effects in recommendations, have mostly been
modeled using bandits with finitely many arms. However, the richer action space provided …

Non-Stationary Bandits with Periodic Behavior: Harnessing Ramanujan Periodicity Transforms to Conquer Time-Varying Challenges

P Thaker, V Gattani, V Tirukkonda… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
In traditional multi-armed bandits (MAB), a standard assumption is that the mean rewards
are constant across each arm, a simplification that can be restrictive in nature. In many real …

A Survey on Techniques and Methods of Recommender System

A Raval, K Borisagar - … Conference on Computational Intelligence in Data …, 2022 - Springer
As prevalence is growing for social media, the value of its content is becoming
paramounting. This data can reveal about a person's personal and professional life. The …

Linear Bandits with Memory

G Clerici, P Laforgue, N Cesa Bianchi - Transactions on Machine …, 2024 - air.unimi.it
Nonstationary phenomena, such as satiation effects in recommendations, have mostly been
modeled using bandits with finitely many arms. However, the richer action space provided …

Data Efficient Sequential Decision Making in High Dimensions

VS Gattani - 2024 - search.proquest.com
As machine learning (ML) systems rapidly advance, their scale and data requirements have
surged, increasing the need for efficient data use while maintaining high performance and …

Lifelong Learning in Multi-Armed Bandits

M Jedor, J Louëdec, V Perchet - arXiv preprint arXiv:2012.14264, 2020 - arxiv.org
Continuously learning and leveraging the knowledge accumulated from prior tasks in order
to improve future performance is a long standing machine learning problem. In this paper …

Sequential decision problems in non-stationary environments

Y Russac - 2022 - theses.hal.science
The vanilla bandit model assumes that the rewards are independent and identically
distributed. However, this assumption is restrictive: it prevents from modeling evolving …