Google Scholar

Nearly optimal best-of-both-worlds algorithms for online learning with feedback graphs

S Ito, T Tsuchiya, J Honda - Advances in Neural Information …, 2022 - proceedings.neurips.cc

This study considers online learning with general directed feedback graphs. For this
problem, we present best-of-both-worlds algorithms that achieve nearly tight regret bounds …

Cited by 29 · All versions (8)

[PDF] neurips.cc

A near-optimal best-of-both-worlds algorithm for online learning with feedback graphs

C Rouyer, D van der Hoeven… - Advances in …, 2022 - proceedings.neurips.cc

We consider online learning with feedback graphs, a sequential decision-making framework
where the learner's feedback is determined by a directed graph over the action set. We …

Cited by 20 · All versions (9)

[PDF] neurips.cc

On the minimax regret for online learning with feedback graphs

K Eldowa, E Esposito, T Cesari… - Advances in Neural …, 2023 - proceedings.neurips.cc

In this work, we improve on the upper and lower bounds for the regret of online learning with
strongly observable undirected feedback graphs. The best known upper bound for this …

Cited by 9 · All versions (13)

[PDF] neurips.cc

An α-regret analysis of Adversarial Bilateral Trade

Y Azar, A Fiat, F Fusco - Advances in Neural Information …, 2022 - proceedings.neurips.cc

We study sequential bilateral trade where sellers' and buyers' valuations are completely
arbitrary (i.e., determined by an adversary). Sellers and buyers are strategic agents with …

Cited by 15 · All versions (13)

[PDF] mlr.press

Can probabilistic feedback drive user impacts in online platforms?

J Dai, B Flanigan, M Jagadeesan… - International …, 2024 - proceedings.mlr.press

A common explanation for negative user impacts of content recommender systems is
misalignment between the platform's objective and user welfare. In this work, we show that …

Cited by 5 · All versions (7)

[PDF] mlr.press

Nonstochastic contextual combinatorial bandits

L Zierahn, D van der Hoeven… - International …, 2023 - proceedings.mlr.press

We study a contextual version of online combinatorial optimisation with full and semi-bandit
feedback. In this sequential decision-making problem, an online learner has to select an …

Cited by 5 · All versions (6)

[PDF] neurips.cc

Practical contextual bandits with feedback graphs

M Zhang, Y Zhang, O Vrousgou… - Advances in Neural …, 2023 - proceedings.neurips.cc

While contextual bandit has a mature theory, effectively leveraging different feedback
patterns to enhance the pace of learning remains unclear. Bandits with feedback graphs …

Cited by 4 · All versions (8)

[PDF] arxiv.org

Efficient contextual bandits with uninformed feedback graphs

M Zhang, Y Zhang, H Luo, P Mineiro - arXiv preprint arXiv:2402.08127, 2024 - arxiv.org

Bandits with feedback graphs are powerful online learning models that interpolate between
the full information and classic bandit problems, capturing many real-life applications. A …

Cited by 2 · All versions (8)

[PDF] mlr.press

Online learning with feedback graphs: The true shape of regret

T Kocák, A Carpentier - International Conference on …, 2023 - proceedings.mlr.press

Sequential learning with feedback graphs is a natural extension of the multi-armed bandit
problem where the problem is equipped with an underlying graph structure that provides …

Cited by 2 · All versions (6)

[PDF] arxiv.org

Online network source optimization with graph-kernel MAB

L Toni, P Frossard - Joint European Conference on Machine Learning and …, 2023 - Springer

We propose Grab-UCB, a graph-kernel multi-armed bandit algorithm to learn online the
optimal source placement in large scale networks, such that the reward obtained from a …

Cited by 2 · All versions (10)

Learning on the edge: Online learning with stochastic feedback graphs

Nearly optimal best-of-both-worlds algorithms for online learning with feedback graphs