Reload: Reinforcement learning with optimistic ascent-descent for last-iterate convergence in constrained MDPs
In recent years, reinforcement learning (RL) has been applied to real-world problems with
increasing success. Such applications often require to put constraints on the agent's …
Efficient off-policy safe reinforcement learning using trust region conditional value at risk
This letter aims to solve a safe reinforcement learning (RL) problem with risk measure-based
constraints. As risk measures, such as conditional value at risk (CVaR), focus on the tail …
Safe adaptive policy transfer reinforcement learning for distributed multiagent control
B Du, W ** for multi-constraint safe reinforcement learning
Online safe reinforcement learning (RL) involves training a policy that maximizes task
efficiency while satisfying constraints via interacting with the environments. In this paper, our …
Scaling Pareto-efficient decision making via offline multi-objective RL
Safe and balanced: A framework for constrained multi-objective reinforcement learning
In numerous reinforcement learning (RL) problems involving safety-critical systems, a key
challenge lies in balancing multiple objectives while simultaneously meeting all stringent …
Scale-Invariant Gradient Aggregation for Constrained Multi-Objective Reinforcement Learning
Multi-objective reinforcement learning (MORL) aims to find a set of Pareto optimal policies to
cover various preferences. However, to apply MORL in real-world applications, it is …
IR-Aware ECO Timing Optimization Using Reinforcement Learning
W Jiang, VA Chhabria, SS Sapatnekar - arXiv preprint arXiv:2402.07781, 2024 - arxiv.org
Engineering change orders (ECOs) in late stages make minimal design fixes to recover from
timing shifts due to excessive IR drops. This paper integrates IR-drop-aware timing analysis …
Feasibility Consistent Representation Learning for Safe Reinforcement Learning
In the field of safe reinforcement learning (RL), finding a balance between satisfying safety
constraints and optimizing reward performance presents a significant challenge. A key …