LTL and beyond: Formal languages for reward function specification in reinforcement learning

A Camacho, RT Icarte, TQ Klassen, RA Valenzano… - IJCAI, 2019 - ijcai.org
In Reinforcement Learning (RL), an agent is guided by the rewards it receives from
the reward function. Unfortunately, it may take many interactions with the environment to …

Teaching multiple tasks to an RL agent using LTL

R Toro Icarte, TQ Klassen, R Valenzano… - Proceedings of the 17th …, 2018 - ifaamas.org
Reinforcement Learning (RL) algorithms are capable of learning effective behaviours
through trial and error interactions with their environment [40]. The recent combination of …

Foundations for restraining bolts: Reinforcement learning with LTLf/LDLf restraining specifications

G De Giacomo, L Iocchi, M Favorito… - Proceedings of the …, 2019 - ojs.aaai.org
In this work we investigate the concept of the “restraining bolt”, envisioned in science fiction.
Specifically, we introduce a novel problem in AI. We have two distinct sets of features …

A formal methods approach to interpretable reinforcement learning for robotic planning

X Li, Z Serlin, G Yang, C Belta - Science Robotics, 2019 - science.org
Growing interest in reinforcement learning approaches to robotic planning and control raises
concerns of predictability and safety of robot behaviors realized solely through learned …

Reinforcement learning with non-markovian rewards

M Gaon, R Brafman - Proceedings of the AAAI conference on artificial …, 2020 - ojs.aaai.org
The standard RL world model is that of a Markov Decision Process (MDP). A basic premise
of MDPs is that the rewards depend on the last state and action only. Yet, many real-world …

LTLf/LDLf non-markovian rewards

R Brafman, G De Giacomo, F Patrizi - Proceedings of the AAAI …, 2018 - ojs.aaai.org
In Markov Decision Processes (MDPs), the reward obtained in a state is Markovian,
i.e., it depends on the last state and action. This dependency makes it difficult to reward more …

Multi-objective decision making

DM Roijers, S Whiteson, R Brachman, P Stone - 2017 - Springer
Many real-world decision problems have multiple objectives. For example, when choosing a
medical treatment plan, we want to maximize the efficacy of the treatment, but also minimize …

Pure-past linear temporal and dynamic logic on finite traces

G De Giacomo, A Di Stasio, F Fuggitti, S Rubin - IJCAI, 2020 - iris.uniroma1.it
LTLf and LDLf are well-known logics on finite traces. We review PLTLf and PLDLf, their pure-
past versions. These are interpreted backward from the end of the trace towards the …

Neural ordinary differential equation control of dynamics on graphs

T Asikis, L Böttcher, N Antulov-Fantulin - Physical Review Research, 2022 - APS
We study the ability of neural networks to calculate feedback control signals that steer
trajectories of continuous-time nonlinear dynamical systems on graphs, which we represent …

Reinforcement learning for joint optimization of multiple rewards

M Agarwal, V Aggarwal - Journal of Machine Learning Research, 2023 - jmlr.org
Finding optimal policies which maximize long term rewards of Markov Decision Processes
requires the use of dynamic programming and backward induction to solve the Bellman …