Reward machines: Exploiting reward function structure in reinforcement learning

RT Icarte, TQ Klassen, R Valenzano… - Journal of Artificial …, 2022 - jair.org
Reinforcement learning (RL) methods usually treat reward functions as black boxes. As
such, these methods must extensively interact with the environment in order to discover …

Lifelong reinforcement learning with temporal logic formulas and reward machines

X Zheng, C Yu, M Zhang - Knowledge-Based Systems, 2022 - Elsevier
Continuously learning new tasks using high-level ideas or knowledge is a key capability of
humans. In this paper, we propose lifelong reinforcement learning with sequential linear …

Multi-Agent Reinforcement Learning with a Hierarchy of Reward Machines

X Zheng, C Yu - arXiv preprint arXiv:2403.07005, 2024 - arxiv.org
In this paper, we study the cooperative Multi-Agent Reinforcement Learning (MARL)
problems using Reward Machines (RMs) to specify the reward functions such that the prior …

Learning Temporal Task Specifications From Demonstrations

M Baert, S Leroux, P Simoens - International Workshop on Explainable …, 2024 - Springer
As we progress towards real-world deployment, the critical need for interpretability in
reinforcement learning algorithms grows more pivotal, ensuring the safety and reliability of …

A simple approach to continual learning by transferring skill parameters

KR Zentner, R Julian, U Puri, Y Zhang… - arXiv preprint arXiv …, 2021 - arxiv.org
In order to be effective general purpose machines in real world environments, robots not
only will need to adapt their existing manipulation skills to new circumstances, they will need …

Reward Machines

RAT Icarte - 2022 - search.proquest.com
Reinforcement learning involves the study of how to solve sequential decision-making
problems using minimal supervision or prior knowledge. In contrast to most methods for …

SparseDICE: Imitation learning for temporally sparse data via regularization

A Camacho, I Gur, ML Moczulski… - ICML 2021 Workshop …, 2021 - openreview.net
Imitation learning learns how to act by observing the behavior of an expert demonstrator. We
are concerned with a setting where the demonstrations comprise only a subset of state …

Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information

J Li, X Zhan, Z Xiao, G Zhou - arXiv preprint arXiv:2110.10905, 2021 - arxiv.org
End-to-end learning robotic manipulation with high data efficiency is one of the key
challenges in robotics. The latest methods that utilize human demonstration data and …

Reward Machines

RA Toro Icarte - 2022 - tspace.library.utoronto.ca
Reinforcement learning involves the study of how to solve sequential decision-making
problems using minimal supervision or prior knowledge. In contrast to most methods for …

Manipulator Reinforcement Learning with Mask Processing Based on Residual Network

X Wang, W Wang, R Li, H Jiang… - 2023 35th Chinese …, 2023 - ieeexplore.ieee.org
In the field of intelligent manufacturing, manipulators are expected to have a higher level of
learning ability to master skills. In this paper, the method of visual support is adopted to …