Federated linear contextual bandits

R Huang, W Wu, J Yang… - Advances in neural …, 2021 - proceedings.neurips.cc
This paper presents a novel federated linear contextual bandits model, where individual
clients face different $ K $-armed stochastic bandits coupled through common global …

Corruption-robust offline reinforcement learning with general function approximation

C Ye, R Yang, Q Gu, T Zhang - Advances in Neural …, 2024 - proceedings.neurips.cc
We investigate the problem of corruption robustness in offline reinforcement learning (RL)
with general function approximation, where an adversary can corrupt each sample in the …

Nearly optimal algorithms for linear contextual bandits with adversarial corruptions

J He, D Zhou, T Zhang, Q Gu - Advances in neural …, 2022 - proceedings.neurips.cc
We study the linear contextual bandit problem in the presence of adversarial corruption,
where the reward at each round is corrupted by an adversary, and the corruption level (i.e. …

Corruption-robust algorithms with uncertainty weighting for nonlinear contextual bandits and Markov decision processes

C Ye, W Xiong, Q Gu, T Zhang - International Conference on …, 2023 - proceedings.mlr.press
Despite the significant interest and progress in reinforcement learning (RL) problems with
adversarial corruption, current works are either confined to the linear setting or lead to an …

A model selection approach for corruption robust reinforcement learning

CY Wei, C Dann, J Zimmert - International Conference on …, 2022 - proceedings.mlr.press
We develop a model selection approach to tackle reinforcement learning with adversarial
corruption in both transition and reward. For finite-horizon tabular MDPs, without prior …

Achieving near instance-optimality and minimax-optimality in stochastic and adversarial linear bandits simultaneously

CW Lee, H Luo, CY Wei, M Zhang… - … on Machine Learning, 2021 - proceedings.mlr.press
In this work, we develop linear bandit algorithms that automatically adapt to different
environments. By plugging a novel loss estimator into the optimization problem that …

Robust Lipschitz bandits to adversarial corruptions

Y Kang, CJ Hsieh, TCM Lee - Advances in Neural …, 2024 - proceedings.neurips.cc
Lipschitz bandit is a variant of stochastic bandits that deals with a continuous arm set
defined on a metric space, where the reward function is subject to a Lipschitz constraint. In …

Robust stochastic linear contextual bandits under adversarial attacks

Q Ding, CJ Hsieh, J Sharpnack - … Conference on Artificial …, 2022 - proceedings.mlr.press
Stochastic linear contextual bandit algorithms have substantial applications in practice, such
as recommender systems, online advertising, clinical trials, etc. Recent works show that …

Reward poisoning attacks on offline multi-agent reinforcement learning

Y Wu, J McMahan, X Zhu, Q Xie - … of the AAAI conference on artificial …, 2023 - ojs.aaai.org
In offline multi-agent reinforcement learning (MARL), agents estimate policies from a given
dataset. We study reward-poisoning attacks in this setting where an exogenous attacker …

Exploiting heterogeneity in robust federated best-arm identification

A Mitra, H Hassani, G Pappas - arXiv preprint arXiv:2109.05700, 2021 - arxiv.org
We study a federated variant of the best-arm identification problem in stochastic multi-armed
bandits: a set of clients, each of whom can sample only a subset of the arms, collaborate via …