LTL and Beyond: Formal Languages for Reward Function Specification in Reinforcement Learning
In Reinforcement Learning (RL), an agent is guided by the rewards it receives from
the reward function. Unfortunately, it may take many interactions with the environment to …
Teaching multiple tasks to an RL agent using LTL
Reinforcement Learning (RL) algorithms are capable of learning effective behaviours
through trial and error interactions with their environment [40]. The recent combination of …
Foundations for restraining bolts: Reinforcement learning with LTLf/LDLf restraining specifications
In this work we investigate the concept of “restraining bolt”, envisioned in science fiction.
Specifically we introduce a novel problem in AI. We have two distinct sets of features …
Reinforcement learning with non-Markovian rewards
The standard RL world model is that of a Markov Decision Process (MDP). A basic premise
of MDPs is that the rewards depend on the last state and action only. Yet, many real-world …
A formal methods approach to interpretable reinforcement learning for robotic planning
Growing interest in reinforcement learning approaches to robotic planning and control raises
concerns of predictability and safety of robot behaviors realized solely through learned …
LTLf/LDLf non-Markovian rewards
In Markov Decision Processes (MDPs), the reward obtained in a state is Markovian,
i.e., depends on the last state and action. This dependency makes it difficult to reward more …
[BOOK] Multi-objective decision making
Many real-world decision problems have multiple objectives. For example, when choosing a
medical treatment plan, we want to maximize the efficacy of the treatment, but also minimize …
Pure-past linear temporal and dynamic logic on finite traces
LTLf and LDLf are well-known logics on finite traces. We review PLTLf and PLDLf, their pure-
past versions. These are interpreted backward from the end of the trace towards the …
Practical solution techniques for first-order MDPs
Many traditional solution approaches to relationally specified decision-theoretic planning
problems (e.g., those stated in the probabilistic planning domain description language, or …
Reinforcement learning for joint optimization of multiple rewards
Finding optimal policies which maximize long term rewards of Markov Decision Processes
requires the use of dynamic programming and backward induction to solve the Bellman …