ReLOAD: Reinforcement learning with optimistic ascent-descent for last-iterate convergence in constrained MDPs

T Moskovitz, B O'Donoghue, V Veeriah… - International …, 2023 - proceedings.mlr.press
In recent years, reinforcement learning (RL) has been applied to real-world problems with
increasing success. Such applications often require to put constraints on the agent's …

Efficient off-policy safe reinforcement learning using trust region conditional value at risk

D Kim, S Oh - IEEE Robotics and Automation Letters, 2022 - ieeexplore.ieee.org
This letter aims to solve a safe reinforcement learning (RL) problem with risk measure-based
constraints. As risk measures, such as conditional value at risk (CVaR), focus on the tail …

Safe adaptive policy transfer reinforcement learning for distributed multiagent control

B Du, W …

… for multi-constraint safe reinforcement learning

Y Yao, Z Liu, Z Cen, P Huang… - … Annual Learning for …, 2024 - proceedings.mlr.press
Online safe reinforcement learning (RL) involves training a policy that maximizes task
efficiency while satisfying constraints via interacting with the environments. In this paper, our …

Scaling Pareto-efficient decision making via offline multi-objective RL

B Zhu, M Dang, A Grover - ar…

… for offline safe reinforcement learning

Y Yao, Z Cen, W Ding, H Lin, S Liu, T Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Offline safe reinforcement learning (RL) aims to train a policy that satisfies constraints using
a pre-collected dataset. Most current methods struggle with the mismatch between imperfect …

Safe and balanced: A framework for constrained multi-objective reinforcement learning

S Gu, B Sel, Y Ding, L Wang, Q Lin… - IEEE Transactions on …, 2025 - ieeexplore.ieee.org
In numerous reinforcement learning (RL) problems involving safety-critical systems, a key
challenge lies in balancing multiple objectives while simultaneously meeting all stringent …

Scale-Invariant Gradient Aggregation for Constrained Multi-Objective Reinforcement Learning

D Kim, M Hong, J Park, S Oh - arXiv preprint arXiv:2403.00282, 2024 - arxiv.org
Multi-objective reinforcement learning (MORL) aims to find a set of Pareto optimal policies to
cover various preferences. However, to apply MORL in real-world applications, it is …

IR-Aware ECO Timing Optimization Using Reinforcement Learning

W Jiang, VA Chhabria, SS Sapatnekar - arXiv preprint arXiv:2402.07781, 2024 - arxiv.org
Engineering change orders (ECOs) in late stages make minimal design fixes to recover from
timing shifts due to excessive IR drops. This paper integrates IR-drop-aware timing analysis …

Feasibility Consistent Representation Learning for Safe Reinforcement Learning

Z Cen, Y Yao, Z Liu, D Zhao - arXiv preprint arXiv:2405.11718, 2024 - arxiv.org
In the field of safe reinforcement learning (RL), finding a balance between satisfying safety
constraints and optimizing reward performance presents a significant challenge. A key …