Safe reinforcement learning via shielding
Reinforcement learning algorithms discover policies that maximize reward, but do not
necessarily guarantee safety during learning or execution phases. We introduce a new …
Safe multi-agent reinforcement learning via shielding
I ElSayed-Aly, S Bharadwaj, C Amato, R Ehlers… - ar**
useful machine-learning applications, their wider adoption has been hindered by the lack of …
Safe reinforcement learning via probabilistic shields
This paper targets the efficient construction of a safety shield for decision making in
scenarios that incorporate uncertainty. Markov decision processes (MDPs) are prominent …
Introduction to model checking
Model checking is a computer-assisted method for the analysis of dynamical
systems that can be modeled by state-transition systems. Drawing from research traditions in …
Learning safe control for multi-robot systems: Methods, verification, and open challenges
In this survey, we review the recent advances in control design methods for robotic multi-
agent systems (MAS), focusing on learning-based methods with safety considerations. We …
Runtime monitoring of dynamic fairness properties
A machine-learned system that is fair in static decision-making tasks may have biased
societal impacts in the long-run. This may happen when the system interacts with humans …
Verification-Guided Shielding for Deep Reinforcement Learning
In recent years, Deep Reinforcement Learning (DRL) has emerged as an effective approach
to solving real-world tasks. However, despite their successes, DRL-based policies suffer …
Enhancing safety in learning from demonstration algorithms via control barrier function shielding
Learning from Demonstration (LfD) is a powerful method for non-roboticist end-users to
teach robots new tasks, enabling them to customize the robot behavior. However, modern …
Risk-aware shielding of partially observable Monte Carlo planning policies
Partially Observable Monte Carlo Planning (POMCP) is a powerful online algorithm
that can generate approximate policies for large Partially Observable Markov Decision …