Trustworthy distributed ai systems: Robustness, privacy, and governance

W Wei, L Liu - ACM Computing Surveys, 2024 - dl.acm.org
Emerging Distributed AI systems are revolutionizing big data computing and data
processing capabilities with growing economic and societal impact. However, recent studies …

The statistical complexity of interactive decision making

DJ Foster, SM Kakade, J Qian, A Rakhlin - arxiv preprint arxiv:2112.13487, 2021 - arxiv.org
A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …

Corruption-robust offline reinforcement learning with general function approximation

C Ye, R Yang, Q Gu, T Zhang - Advances in Neural …, 2024 - proceedings.neurips.cc
We investigate the problem of corruption robustness in offline reinforcement learning (RL)
with general function approximation, where an adversary can corrupt each sample in the …

Bypassing the simulator: Near-optimal adversarial linear contextual bandits

H Liu, CY Wei, J Zimmert - Advances in Neural Information …, 2024 - proceedings.neurips.cc
We consider the adversarial linear contextual bandit problem, where the loss vectors are
selected fully adversarially and the per-round action set (ie the context) is drawn from a fixed …

Feel-good thompson sampling for contextual bandits and reinforcement learning

T Zhang - SIAM Journal on Mathematics of Data Science, 2022 - SIAM
Thompson sampling has been widely used for contextual bandit problems due to the
flexibility of its modeling power. However, a general theory for this class of methods in the …

Nearly optimal algorithms for linear contextual bandits with adversarial corruptions

J He, D Zhou, T Zhang, Q Gu - Advances in neural …, 2022 - proceedings.neurips.cc
We study the linear contextual bandit problem in the presence of adversarial corruption,
where the reward at each round is corrupted by an adversary, and the corruption level (ie …

Corruption-robust algorithms with uncertainty weighting for nonlinear contextual bandits and markov decision processes

C Ye, W **ong, Q Gu, T Zhang - International Conference on …, 2023 - proceedings.mlr.press
Despite the significant interest and progress in reinforcement learning (RL) problems with
adversarial corruption, current works are either confined to the linear setting or lead to an …

Contextual bandits with large action spaces: Made practical

Y Zhu, DJ Foster, J Langford… - … Conference on Machine …, 2022 - proceedings.mlr.press
A central problem in sequential decision making is to develop algorithms that are practical
and computationally efficient, yet support the use of flexible, general-purpose models …

Bayesian decision-making under misspecified priors with applications to meta-learning

M Simchowitz, C Tosh… - Advances in …, 2021 - proceedings.neurips.cc
Thompson sampling and other Bayesian sequential decision-making algorithms are among
the most popular approaches to tackle explore/exploit trade-offs in (contextual) bandits. The …

A model selection approach for corruption robust reinforcement learning

CY Wei, C Dann, J Zimmert - International Conference on …, 2022 - proceedings.mlr.press
We develop a model selection approach to tackle reinforcement learning with adversarial
corruption in both transition and reward. For finite-horizon tabular MDPs, without prior …