Wasserstein robust reinforcement learning

MA Abdullah, H Ren, HB Ammar, V Milenkovic… - arXiv preprint arXiv …, 2019 - arxiv.org
Reinforcement learning algorithms, though successful, tend to over-fit to training
environments, hampering their application to the real world. This paper proposes $\text …

Robust φ-Divergence MDPs

CP Ho, M Petrik, W Wiesemann - Advances in Neural …, 2022 - proceedings.neurips.cc
In recent years, robust Markov decision processes (MDPs) have emerged as a prominent
modeling framework for dynamic decision problems affected by uncertainty. In contrast to …

Beyond confidence regions: Tight Bayesian ambiguity sets for robust MDPs

M Petrik, RH Russel - Advances in neural information …, 2019 - proceedings.neurips.cc
Robust MDPs (RMDPs) can be used to compute policies with provable worst-case
guarantees in reinforcement learning. The quality and robustness of an RMDP solution are …

Distributionally robust reinforcement learning

E Smirnova, E Dohmatob, J Mary - arXiv preprint arXiv:1902.08708, 2019 - arxiv.org
Real-world applications require RL algorithms to act safely. During the learning process, it is
likely that the agent executes sub-optimal actions that may lead to unsafe/poor states of the …

Robust Q-learning algorithm for Markov decision processes under Wasserstein uncertainty

A Neufeld, J Sester - Automatica, 2024 - Elsevier
We present a novel Q-learning algorithm tailored to solve distributionally robust Markov
decision problems where the corresponding ambiguity set of transition probabilities for the …

A Bayesian approach to robust reinforcement learning

E Derman, D Mankowitz, T Mann… - Uncertainty in Artificial …, 2020 - proceedings.mlr.press
Robust Markov Decision Processes (RMDPs) intend to ensure robustness with
respect to changing or adversarial system behavior. In this framework, transitions are …

Bayesian robust optimization for imitation learning

D Brown, S Niekum, M Petrik - Advances in Neural …, 2020 - proceedings.neurips.cc
One of the main challenges in imitation learning is determining what action an agent should
take when outside the state distribution of the demonstrations. Inverse reinforcement …

Sequential decision-making under uncertainty: A robust MDPs review

W Ou, S Bi - arXiv preprint arXiv:2404.00940, 2024 - arxiv.org
Fueled by both advances in robust optimization theory and applications of reinforcement
learning, robust Markov Decision Processes (RMDPs) have gained increasing attention, due …

Byzantine-resilient decentralized policy evaluation with linear function approximation

Z Wu, H Shen, T Chen, Q Ling - IEEE Transactions on Signal …, 2021 - ieeexplore.ieee.org
In this paper, we consider the policy evaluation problem in reinforcement learning with
agents on a decentralized and directed network. In order to evaluate the quality of a fixed …

Robust Multiobjective Reinforcement Learning Considering Environmental Uncertainties

X He, J Hao, X Chen, J Wang, X Ji… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Numerous real-world decision or control problems involve multiple conflicting objectives
whose relative importance (preference) is required to be weighed in different scenarios …