Last-iterate convergent policy gradient primal-dual methods for constrained MDPs

D Ding, CY Wei, K Zhang… - Advances in Neural …, 2023 - proceedings.neurips.cc
We study the problem of computing an optimal policy of an infinite-horizon discounted
constrained Markov decision process (constrained MDP). Despite the popularity of …

Structure in deep reinforcement learning: A survey and open problems

A Mohan, A Zhang, M Lindauer - Journal of Artificial Intelligence Research, 2024 - jair.org
Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural
Networks (DNNs) for function approximation, has demonstrated considerable success in …

Safety-constrained reinforcement learning with a distributional safety critic

Q Yang, TD Simão, SH Tindemans, MTJ Spaan - Machine Learning, 2023 - Springer
Safety is critical to broadening the real-world use of reinforcement learning. Modeling the
safety aspects using a safety-cost signal separate from the reward and bounding the …

ReLOAD: Reinforcement learning with optimistic ascent-descent for last-iterate convergence in constrained MDPs

T Moskovitz, B O'Donoghue, V Veeriah… - International …, 2023 - proceedings.mlr.press
In recent years, reinforcement learning (RL) has been applied to real-world problems with
increasing success. Such applications often require to put constraints on the agent's …

DOPE: Doubly optimistic and pessimistic exploration for safe reinforcement learning

A Bura, A HasanzadeZonuzy… - Advances in neural …, 2022 - proceedings.neurips.cc
Safe reinforcement learning is extremely challenging--not only must the agent explore an
unknown environment, it must do so while ensuring no safety constraint violations. We …

Verification-Guided Shielding for Deep Reinforcement Learning

D Corsi, G Amir, A Rodríguez, C Sánchez… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, Deep Reinforcement Learning (DRL) has emerged as an effective approach
to solving real-world tasks. However, despite their successes, DRL-based policies suffer …

Autonomous driving based on approximate safe action

X Wang, J Zhang, D Hou… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Safety limits the application of traditional reinforcement learning (RL) methods to
autonomous driving. To address the challenge of safe exploration in autonomous driving …

Constrained multiagent Markov decision processes: A taxonomy of problems and algorithms

F De Nijs, E Walraven, M De Weerdt, M Spaan - Journal of Artificial …, 2021 - jair.org
In domains such as electric vehicle charging, smart distribution grids and autonomous
warehouses, multiple agents share the same resources. When planning the use of these …

Enhancing safe exploration using safety state augmentation

A Sootla, A Cowen-Rivers, J Wang… - Advances in Neural …, 2022 - proceedings.neurips.cc
Safe exploration is a challenging and important problem in model-free reinforcement
learning (RL). Often the safety cost is sparse and unknown, which unavoidably leads to …

Safe reinforcement learning with natural language constraints

TY Yang, MY Hu, Y Chow… - Advances in …, 2021 - proceedings.neurips.cc
While safe reinforcement learning (RL) holds great promise for many practical applications
like robotics or autonomous cars, current approaches require specifying constraints in …