Sample complexity of robust reinforcement learning with a generative model

K Panaganti, D Kalathil - International Conference on …, 2022 - proceedings.mlr.press
Abstract The Robust Markov Decision Process (RMDP) framework focuses on designing
control policies that are robust against the parameter uncertainties due to the mismatches …

Toward theoretical understandings of robust Markov decision processes: Sample complexity and asymptotics

W Yang, L Zhang, Z Zhang - The Annals of Statistics, 2022 - projecteuclid.org

Offline reinforcement learning as anti-exploration

S Rezaeifar, R Dadashi, N Vieillard… - Proceedings of the …, 2022 - ojs.aaai.org
Abstract Offline Reinforcement Learning (RL) aims at learning an optimal control policy from a fixed
dataset, without interacting with the system. An agent in this setting should avoid selecting …

Safe policy improvement by minimizing robust baseline regret

M Ghavamzadeh, M Petrik… - Advances in Neural …, 2016 - proceedings.neurips.cc
An important problem in sequential decision-making under uncertainty is to use limited data
to compute a safe policy, i.e., a policy that is guaranteed to perform at least as well as a given …

Policy gradient in robust MDPs with global convergence guarantee

Q Wang, CP Ho, M Petrik - International Conference on …, 2023 - proceedings.mlr.press
Abstract Robust Markov decision processes (RMDPs) provide a promising framework for
computing reliable policies in the face of model errors. Many successful reinforcement …

Fast Bellman updates for Wasserstein distributionally robust MDPs

Z Yu, L Dai, S Xu, S Gao, CP Ho - Advances in Neural …, 2023 - proceedings.neurips.cc
Markov decision processes (MDPs) often suffer from the sensitivity issue under model
ambiguity. In recent years, robust MDPs have emerged as an effective framework to …

Partial policy iteration for L1-robust Markov decision processes

CP Ho, M Petrik, W Wiesemann - Journal of Machine Learning Research, 2021 - jmlr.org
Robust Markov decision processes (MDPs) compute reliable solutions for dynamic decision
problems with partially-known transition probabilities. Unfortunately, accounting for …

Confounding-robust policy evaluation in infinite-horizon reinforcement learning

N Kallus, A Zhou - Advances in neural information …, 2020 - proceedings.neurips.cc
Off-policy evaluation of sequential decision policies from observational data is necessary in
applications of batch reinforcement learning such as education and healthcare. In such …

Robust φ-Divergence MDPs

CP Ho, M Petrik, W Wiesemann - Advances in Neural …, 2022 - proceedings.neurips.cc
In recent years, robust Markov decision processes (MDPs) have emerged as a prominent
modeling framework for dynamic decision problems affected by uncertainty. In contrast to …

Beyond confidence regions: Tight Bayesian ambiguity sets for robust MDPs

M Petrik, RH Russel - Advances in neural information …, 2019 - proceedings.neurips.cc
Abstract Robust MDPs (RMDPs) can be used to compute policies with provable worst-case
guarantees in reinforcement learning. The quality and robustness of an RMDP solution are …