Symbolic task inference in deep reinforcement learning

H Hasanbeig, NY Jeppu, A Abate, T Melham… - Journal of Artificial …, 2024 - jair.org
This paper proposes DeepSynth, a method for effective training of deep reinforcement
learning agents when the reward is sparse or non-Markovian, but at the same time progress …

Reinforcement learning with predefined and inferred reward machines in stochastic games

J Hu, Y Paliwal, H Kim, Y Wang, Z Xu - Neurocomputing, 2024 - Elsevier
This paper focuses on Multi-Agent Reinforcement Learning (MARL) in non-cooperative
stochastic games, particularly addressing the challenge of task completion characterized by …

Translating omega-regular specifications to average objectives for model-free reinforcement learning

M Kazemi, M Perez, F Somenzi, S Soudjani… - Proc. of the 21st …, 2022 - par.nsf.gov
Recent success in reinforcement learning (RL) has brought renewed attention to the design
of reward functions by which agent behavior is reinforced or deterred. Manually designing …

Regular Reinforcement Learning

T Dohmen, M Perez, F Somenzi, A Trivedi - International Conference on …, 2024 - Springer
In reinforcement learning, an agent incrementally refines a behavioral policy through a
series of episodic interactions with its environment. This process can be characterized as …

Hierarchies of reward machines

D Furelos-Blanco, M Law, A Jonsson… - International …, 2023 - proceedings.mlr.press
Reward machines (RMs) are a recent formalism for representing the reward function of a
reinforcement learning task through a finite-state machine whose edges encode subgoals of …

Inferring probabilistic reward machines from non-Markovian reward signals for reinforcement learning

T Dohmen, N Topper, G Atia, A Beckus… - Proceedings of the …, 2022 - ojs.aaai.org
The success of reinforcement learning in typical settings is predicated on Markovian
assumptions on the reward signal by which an agent learns optimal policies. In recent years …

Learning task automata for reinforcement learning using hidden Markov models

A Abate, Y Almulla, J Fox, D Hyland, M Wooldridge - ECAI 2023, 2023 - ebooks.iospress.nl
Training reinforcement learning (RL) agents using scalar reward signals is often infeasible
when an environment has sparse and non-Markovian rewards. Moreover, handcrafting …

Learning Environment Models with Continuous Stochastic Dynamics

M Tappler, E Muškardin, BK Aichernig… - arXiv preprint arXiv …, 2023 - arxiv.org
Solving control tasks in complex environments automatically through learning offers great
potential. While contemporary techniques from deep reinforcement learning (DRL) provide …

Multi-Agent Reinforcement Learning with a Hierarchy of Reward Machines

X Zheng, C Yu - arXiv preprint arXiv:2403.07005, 2024 - arxiv.org
In this paper, we study the cooperative Multi-Agent Reinforcement Learning (MARL)
problems using Reward Machines (RMs) to specify the reward functions such that the prior …

Reinforcement learning under partial observability guided by learned environment models

E Muškardin, M Tappler, BK Aichernig, I Pill - International Conference on …, 2023 - Springer
Reinforcement learning and planning under partial observability is notoriously difficult. In
this setting, decision-making agents need to perform a sequence of actions with incomplete …