LTL and Beyond: Formal Languages for Reward Function Specification in Reinforcement Learning

A Camacho, RT Icarte, TQ Klassen, RA Valenzano… - IJCAI, 2019 - ijcai.org
In Reinforcement Learning (RL), an agent is guided by the rewards it receives from
the reward function. Unfortunately, it may take many interactions with the environment to …

Teaching multiple tasks to an RL agent using LTL

R Toro Icarte, TQ Klassen, R Valenzano… - Proceedings of the 17th …, 2018 - ifaamas.org
Reinforcement Learning (RL) algorithms are capable of learning effective behaviours
through trial and error interactions with their environment [40]. The recent combination of …

Foundations for restraining bolts: Reinforcement learning with LTLf/LDLf restraining specifications

G De Giacomo, L Iocchi, M Favorito… - Proceedings of the …, 2019 - ojs.aaai.org
In this work we investigate on the concept of “restraining bolt”, envisioned in Science Fiction.
Specifically we introduce a novel problem in AI. We have two distinct sets of features …

Reinforcement learning with non-markovian rewards

M Gaon, R Brafman - Proceedings of the AAAI conference on artificial …, 2020 - ojs.aaai.org
The standard RL world model is that of a Markov Decision Process (MDP). A basic premise
of MDPs is that the rewards depend on the last state and action only. Yet, many real-world …

A formal methods approach to interpretable reinforcement learning for robotic planning

X Li, Z Serlin, G Yang, C Belta - Science Robotics, 2019 - science.org
Growing interest in reinforcement learning approaches to robotic planning and control raises
concerns of predictability and safety of robot behaviors realized solely through learned …

LTLf/LDLf non-markovian rewards

R Brafman, G De Giacomo, F Patrizi - Proceedings of the AAAI …, 2018 - ojs.aaai.org
In Markov Decision Processes (MDPs), the reward obtained in a state is Markovian,
i.e., depends on the last state and action. This dependency makes it difficult to reward more …

[BOOK] Multi-objective decision making

DM Roijers, S Whiteson, R Brachman, P Stone - 2017 - Springer
Many real-world decision problems have multiple objectives. For example, when choosing a
medical treatment plan, we want to maximize the efficacy of the treatment, but also minimize …

Pure-past linear temporal and dynamic logic on finite traces

G De Giacomo, A Di Stasio, F Fuggitti, S Rubin - IJCAI, 2020 - iris.uniroma1.it
LTLf and LDLf are well-known logics on finite traces. We review PLTLf and PLDLf, their pure-
past versions. These are interpreted backward from the end of the trace towards the …

Practical solution techniques for first-order MDPs

S Sanner, C Boutilier - Artificial Intelligence, 2009 - Elsevier
Many traditional solution approaches to relationally specified decision-theoretic planning
problems (e.g., those stated in the probabilistic planning domain description language, or …

Reinforcement learning for joint optimization of multiple rewards

M Agarwal, V Aggarwal - Journal of Machine Learning Research, 2023 - jmlr.org
Finding optimal policies which maximize long term rewards of Markov Decision Processes
requires the use of dynamic programming and backward induction to solve the Bellman …