A near-optimal best-of-both-worlds algorithm for online learning with feedback graphs

C Rouyer, D van der Hoeven… - Advances in …, 2022 - proceedings.neurips.cc
We consider online learning with feedback graphs, a sequential decision-making framework
where the learner's feedback is determined by a directed graph over the action set. We …

An α-regret analysis of Adversarial Bilateral Trade

Y Azar, A Fiat, F Fusco - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We study sequential bilateral trade where sellers' and buyers' valuations are completely
arbitrary (i.e., determined by an adversary). Sellers and buyers are strategic agents with …

Learning on the edge: Online learning with stochastic feedback graphs

E Esposito, F Fusco… - Advances in …, 2022 - proceedings.neurips.cc
The framework of feedback graphs is a generalization of sequential decision-making with
bandit or full information feedback. In this work, we study an extension where the directed …

Online structured prediction with Fenchel–Young losses and improved surrogate regret for online multiclass classification with logistic loss

S Sakaue, H Bao, T Tsuchiya… - The Thirty Seventh …, 2024 - proceedings.mlr.press
This paper studies online structured prediction with full-information feedback. For online
multiclass classification, Van der Hoeven (2020) established finite surrogate regret …

Practical contextual bandits with feedback graphs

M Zhang, Y Zhang, O Vrousgou… - Advances in Neural …, 2023 - proceedings.neurips.cc
While contextual bandit has a mature theory, effectively leveraging different feedback
patterns to enhance the pace of learning remains unclear. Bandits with feedback graphs …

Efficient online set-valued classification with bandit feedback

Z Wang, X Qiao - arXiv preprint arXiv:2405.04393, 2024 - arxiv.org
Conformal prediction is a distribution-free method that wraps a given machine learning
model and returns a set of plausible labels that contain the true label with a prescribed …

A regret-variance trade-off in online learning

D Van der Hoeven, N Zhivotovskiy… - Advances in Neural …, 2022 - proceedings.neurips.cc
We consider prediction with expert advice for strongly convex and bounded losses, and
investigate trade-offs between regret and "variance" (i.e., squared difference of learner's …

Online learning with set-valued feedback

V Raman, U Subedi, A Tewari - The Thirty Seventh Annual …, 2024 - proceedings.mlr.press
We study a variant of online multiclass classification where the learner predicts a single
label but receives a set of labels as feedback. In this model, the learner is penalized …

Neural active learning meets the partial monitoring framework

M Heuillet, O Ahmad, A Durand - arXiv preprint arXiv:2405.08921, 2024 - arxiv.org
We focus on the online-based active learning (OAL) setting where an agent operates over a
stream of observations and trades off between the costly acquisition of information (labelled …

Trading-off payments and accuracy in online classification with paid stochastic experts

D Van Der Hoeven, C Pike-Burke… - International …, 2023 - proceedings.mlr.press
We investigate online classification with paid stochastic experts. Here, before making their
prediction, each expert must be paid. The amount that we pay each expert directly …