Settling the sample complexity of model-based offline reinforcement learning

G Li, L Shi, Y Chen, Y Chi, Y Wei - The Annals of Statistics, 2024 - projecteuclid.org
The Annals of Statistics 2024, Vol. 52, No. 1, 233–260. https://doi.org/10.1214/23-AOS2342 © …

Distributionally robust model-based offline reinforcement learning with near-optimal sample complexity

L Shi, Y Chi - Journal of Machine Learning Research, 2024 - jmlr.org
This paper concerns the central issues of model robustness and sample efficiency in offline
reinforcement learning (RL), which aims to learn to perform decision making from history …

Distributionally robust off-dynamics reinforcement learning: Provable efficiency with linear function approximation

Z Liu, P Xu - International Conference on Artificial …, 2024 - proceedings.mlr.press
We study off-dynamics Reinforcement Learning (RL), where the policy is trained on a source
domain and deployed to a distinct target domain. We aim to solve this problem via online …

Seeing is not believing: Robust reinforcement learning against spurious correlation

W Ding, L Shi, Y Chi, D Zhao - Advances in Neural …, 2023 - proceedings.neurips.cc
Robustness has been extensively studied in reinforcement learning (RL) to handle various
forms of uncertainty such as random perturbations, rare events, and malicious attacks. In this …

Minimax optimal and computationally efficient algorithms for distributionally robust offline reinforcement learning

Z Liu, P Xu - Advances in Neural Information Processing …, 2025 - proceedings.neurips.cc
Distributionally robust offline reinforcement learning (RL), which seeks robust policy training
against environment perturbation by modeling dynamics uncertainty, calls for function …

Settling the sample complexity of online reinforcement learning

Z Zhang, Y Chen, JD Lee… - The Thirty Seventh Annual …, 2024 - proceedings.mlr.press
A central issue lying at the heart of online reinforcement learning (RL) is data efficiency.
While a number of recent works achieved asymptotically minimal regret in online RL, the …

Sample-efficient robust multi-agent reinforcement learning in the face of environmental uncertainty

L Shi, E Mazumdar, Y Chi, A Wierman - arXiv preprint arXiv:2404.18909, 2024 - arxiv.org
To overcome the sim-to-real gap in reinforcement learning (RL), learned policies must
maintain robustness against environmental uncertainties. While robust RL has been widely …

Sample complexity of offline distributionally robust linear Markov decision processes

H Wang, L Shi, Y Chi - arXiv preprint arXiv:2403.12946, 2024 - arxiv.org
In offline reinforcement learning (RL), the absence of active exploration calls for attention on
the model robustness to tackle the sim-to-real gap, where the discrepancy between the …

Distributionally robust model-based reinforcement learning with large state spaces

SS Ramesh, PG Sessa, Y Hu… - International …, 2024 - proceedings.mlr.press
Three major challenges in reinforcement learning are the complex dynamical systems with
large state spaces, the costly data acquisition processes, and the deviation of real-world …

Towards minimax optimality of model-based robust reinforcement learning

P Clavier, EL Pennec, M Geist - arXiv preprint arXiv:2302.05372, 2023 - arxiv.org
We study the sample complexity of obtaining an $\epsilon$-optimal policy in Robust
discounted Markov Decision Processes (RMDPs), given only access to a generative model …