Safe reinforcement learning via shielding

M Alshiekh, R Bloem, R Ehlers, B Könighofer… - Proceedings of the …, 2018 - ojs.aaai.org
Reinforcement learning algorithms discover policies that maximize reward, but do not
necessarily guarantee safety during learning or execution phases. We introduce a new …

Safe reinforcement learning via probabilistic shields

N Jansen, B Könighofer, S Junges, AC Serban… - arXiv preprint arXiv …, 2018 - arxiv.org
This paper targets the efficient construction of a safety shield for decision making in
scenarios that incorporate uncertainty. Markov decision processes (MDPs) are prominent …

Introduction to model checking

EM Clarke, TA Henzinger, H Veith - Handbook of Model Checking, 2018 - Springer
Model checking is a computer-assisted method for the analysis of dynamical
systems that can be modeled by state-transition systems. Drawing from research traditions in …

Learning safe control for multi-robot systems: Methods, verification, and open challenges

K Garg, S Zhang, O So, C Dawson, C Fan - Annual Reviews in Control, 2024 - Elsevier
In this survey, we review the recent advances in control design methods for robotic multi-
agent systems (MAS), focusing on learning-based methods with safety considerations. We …

Runtime monitoring of dynamic fairness properties

T Henzinger, M Karimi, K Kueffner… - Proceedings of the 2023 …, 2023 - dl.acm.org
A machine-learned system that is fair in static decision-making tasks may have biased
societal impacts in the long run. This may happen when the system interacts with humans …

Verification-Guided Shielding for Deep Reinforcement Learning

D Corsi, G Amir, A Rodríguez, C Sánchez… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, Deep Reinforcement Learning (DRL) has emerged as an effective approach
to solving real-world tasks. However, despite their successes, DRL-based policies suffer …

Enhancing safety in learning from demonstration algorithms via control barrier function shielding

Y Yang, L Chen, Z Zaidi, S van Waveren… - Proceedings of the …, 2024 - dl.acm.org
Learning from Demonstration (LfD) is a powerful method for non-roboticist end-users to
teach robots new tasks, enabling them to customize the robot behavior. However, modern …

Risk-aware shielding of partially observable Monte Carlo planning policies

G Mazzi, A Castellini, A Farinelli - Artificial Intelligence, 2023 - Elsevier
Partially Observable Monte Carlo Planning (POMCP) is a powerful online algorithm
that can generate approximate policies for large Partially Observable Markov Decision …