Last-iterate convergent policy gradient primal-dual methods for constrained MDPs
We study the problem of computing an optimal policy of an infinite-horizon discounted
constrained Markov decision process (constrained MDP). Despite the popularity of …
Structure in deep reinforcement learning: A survey and open problems
Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural
Networks (DNNs) for function approximation, has demonstrated considerable success in …
Safety-constrained reinforcement learning with a distributional safety critic
Safety is critical to broadening the real-world use of reinforcement learning. Modeling the
safety aspects using a safety-cost signal separate from the reward and bounding the …
Reload: Reinforcement learning with optimistic ascent-descent for last-iterate convergence in constrained MDPs
In recent years, reinforcement learning (RL) has been applied to real-world problems with
increasing success. Such applications often require to put constraints on the agent's …
DOPE: Doubly optimistic and pessimistic exploration for safe reinforcement learning
Safe reinforcement learning is extremely challenging--not only must the agent explore an
unknown environment, it must do so while ensuring no safety constraint violations. We …
Verification-Guided Shielding for Deep Reinforcement Learning
In recent years, Deep Reinforcement Learning (DRL) has emerged as an effective approach
to solving real-world tasks. However, despite their successes, DRL-based policies suffer …
Autonomous driving based on approximate safe action
X Wang, J Zhang, D Hou… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Safety limits the application of traditional reinforcement learning (RL) methods to
autonomous driving. To address the challenge of safe exploration in autonomous driving …
Constrained multiagent Markov decision processes: A taxonomy of problems and algorithms
In domains such as electric vehicle charging, smart distribution grids and autonomous
warehouses, multiple agents share the same resources. When planning the use of these …
Enhancing safe exploration using safety state augmentation
Safe exploration is a challenging and important problem in model-free reinforcement
learning (RL). Often the safety cost is sparse and unknown, which unavoidably leads to …
Safe reinforcement learning with natural language constraints
While safe reinforcement learning (RL) holds great promise for many practical applications
like robotics or autonomous cars, current approaches require specifying constraints in …