Federated linear contextual bandits
This paper presents a novel federated linear contextual bandits model, where individual
clients face different $ K $-armed stochastic bandits coupled through common global …
Corruption-robust offline reinforcement learning with general function approximation
We investigate the problem of corruption robustness in offline reinforcement learning (RL)
with general function approximation, where an adversary can corrupt each sample in the …
Nearly optimal algorithms for linear contextual bandits with adversarial corruptions
We study the linear contextual bandit problem in the presence of adversarial corruption,
where the reward at each round is corrupted by an adversary, and the corruption level (i.e. …
Corruption-robust algorithms with uncertainty weighting for nonlinear contextual bandits and Markov decision processes
Despite the significant interest and progress in reinforcement learning (RL) problems with
adversarial corruption, current works are either confined to the linear setting or lead to an …
A model selection approach for corruption robust reinforcement learning
We develop a model selection approach to tackle reinforcement learning with adversarial
corruption in both transition and reward. For finite-horizon tabular MDPs, without prior …
Achieving near instance-optimality and minimax-optimality in stochastic and adversarial linear bandits simultaneously
In this work, we develop linear bandit algorithms that automatically adapt to different
environments. By plugging a novel loss estimator into the optimization problem that …
Robust Lipschitz bandits to adversarial corruptions
Lipschitz bandit is a variant of stochastic bandits that deals with a continuous arm set
defined on a metric space, where the reward function satisfies a Lipschitz constraint. In …
Robust stochastic linear contextual bandits under adversarial attacks
Stochastic linear contextual bandit algorithms have substantial applications in practice, such
as recommender systems, online advertising, clinical trials, etc. Recent works show that …
Reward poisoning attacks on offline multi-agent reinforcement learning
In offline multi-agent reinforcement learning (MARL), agents estimate policies from a given
dataset. We study reward-poisoning attacks in this setting where an exogenous attacker …
Exploiting heterogeneity in robust federated best-arm identification
We study a federated variant of the best-arm identification problem in stochastic multi-armed
bandits: a set of clients, each of whom can sample only a subset of the arms, collaborate via …