Nearly optimal best-of-both-worlds algorithms for online learning with feedback graphs

S Ito, T Tsuchiya, J Honda - Advances in Neural Information …, 2022 - proceedings.neurips.cc
This study considers online learning with general directed feedback graphs. For this
problem, we present best-of-both-worlds algorithms that achieve nearly tight regret bounds …

A near-optimal best-of-both-worlds algorithm for online learning with feedback graphs

C Rouyer, D van der Hoeven… - Advances in …, 2022 - proceedings.neurips.cc
We consider online learning with feedback graphs, a sequential decision-making framework
where the learner's feedback is determined by a directed graph over the action set. We …

On the minimax regret for online learning with feedback graphs

K Eldowa, E Esposito, T Cesari… - Advances in Neural …, 2023 - proceedings.neurips.cc
In this work, we improve on the upper and lower bounds for the regret of online learning with
strongly observable undirected feedback graphs. The best known upper bound for this …

An α-regret analysis of Adversarial Bilateral Trade

Y Azar, A Fiat, F Fusco - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We study sequential bilateral trade where sellers' and buyers' valuations are completely
arbitrary (i.e., determined by an adversary). Sellers and buyers are strategic agents with …

Can probabilistic feedback drive user impacts in online platforms?

J Dai, B Flanigan, M Jagadeesan… - International …, 2024 - proceedings.mlr.press
A common explanation for negative user impacts of content recommender systems is
misalignment between the platform's objective and user welfare. In this work, we show that …

Nonstochastic contextual combinatorial bandits

L Zierahn, D van der Hoeven… - International …, 2023 - proceedings.mlr.press
We study a contextual version of online combinatorial optimisation with full and semi-bandit
feedback. In this sequential decision-making problem, an online learner has to select an …

Practical contextual bandits with feedback graphs

M Zhang, Y Zhang, O Vrousgou… - Advances in Neural …, 2023 - proceedings.neurips.cc
While contextual bandit has a mature theory, effectively leveraging different feedback
patterns to enhance the pace of learning remains unclear. Bandits with feedback graphs …

Efficient contextual bandits with uninformed feedback graphs

M Zhang, Y Zhang, H Luo, P Mineiro - arXiv preprint arXiv:2402.08127, 2024 - arxiv.org
Bandits with feedback graphs are powerful online learning models that interpolate between
the full information and classic bandit problems, capturing many real-life applications. A …

Online learning with feedback graphs: The true shape of regret

T Kocák, A Carpentier - International Conference on …, 2023 - proceedings.mlr.press
Sequential learning with feedback graphs is a natural extension of the multi-armed bandit
problem where the problem is equipped with an underlying graph structure that provides …

Online network source optimization with graph-kernel MAB

L Toni, P Frossard - Joint European Conference on Machine Learning and …, 2023 - Springer
We propose Grab-UCB, a graph-kernel multi-armed bandit algorithm to learn online the
optimal source placement in large scale networks, such that the reward obtained from a …