Federated linear contextual bandits
This paper presents a novel federated linear contextual bandits model, where individual
clients face different $ K $-armed stochastic bandits coupled through common global …
Corruption-robust offline reinforcement learning with general function approximation
We investigate the problem of corruption robustness in offline reinforcement learning (RL)
with general function approximation, where an adversary can corrupt each sample in the …
Nearly optimal algorithms for linear contextual bandits with adversarial corruptions
We study the linear contextual bandit problem in the presence of adversarial corruption,
where the reward at each round is corrupted by an adversary, and the corruption level (i.e. …
Corruption-robust algorithms with uncertainty weighting for nonlinear contextual bandits and Markov decision processes
Despite the significant interest and progress in reinforcement learning (RL) problems with
adversarial corruption, current works are either confined to the linear setting or lead to an …
A model selection approach for corruption robust reinforcement learning
We develop a model selection approach to tackle reinforcement learning with adversarial
corruption in both transition and reward. For finite-horizon tabular MDPs, without prior …
Achieving near instance-optimality and minimax-optimality in stochastic and adversarial linear bandits simultaneously
In this work, we develop linear bandit algorithms that automatically adapt to different
environments. By plugging a novel loss estimator into the optimization problem that …
Robust Lipschitz bandits to adversarial corruptions
Lipschitz bandit is a variant of stochastic bandits that deals with a continuous arm set
defined on a metric space, where the reward function satisfies a Lipschitz constraint. In …
Robust stochastic linear contextual bandits under adversarial attacks
Stochastic linear contextual bandit algorithms have substantial applications in practice, such
as recommender systems, online advertising, clinical trials, etc. Recent works show that …
Reward poisoning attacks on offline multi-agent reinforcement learning
In offline multi-agent reinforcement learning (MARL), agents estimate policies from a given
dataset. We study reward-poisoning attacks in this setting where an exogenous attacker …
Exploiting heterogeneity in robust federated best-arm identification
We study a federated variant of the best-arm identification problem in stochastic multi-armed
bandits: a set of clients, each of whom can sample only a subset of the arms, collaborate via …