Temporally-extended ε-greedy exploration

W Dabney, G Ostrovski, A Barreto - arXiv preprint arXiv:2006.01782, 2020 - arxiv.org
Recent work on exploration in reinforcement learning (RL) has led to a series of increasingly
complex solutions to the problem. This increase in complexity often comes at the expense of …

Temporal abstraction in reinforcement learning with the successor representation

MC Machado, A Barreto, D Precup… - Journal of Machine …, 2023 - jmlr.org
Reasoning at multiple levels of temporal abstraction is one of the key attributes of
intelligence. In reinforcement learning, this is often modeled through temporally extended …

Flexible modulation of sequence generation in the entorhinal–hippocampal system

DC McNamee, KL Stachenfeld, MM Botvinick… - Nature …, 2021 - nature.com
Exploration, consolidation and planning depend on the generation of sequential state
representations. However, these algorithms require disparate forms of sampling dynamics …

Scalable multi-agent covering option discovery based on Kronecker graphs

J Chen, J Chen, T Lan… - Advances in Neural …, 2022 - proceedings.neurips.cc
Covering option discovery has been developed to improve the exploration of RL in single-
agent scenarios with sparse reward signals, through connecting the most distant states in …

Flexible option learning

M Klissarov, D Precup - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Temporal abstraction in reinforcement learning (RL) offers the promise of improving
generalization and knowledge transfer in complex environments, by propagating information …

Skill discovery for exploration and planning using deep skill graphs

A Bagaria, JK Senthil… - … Conference on Machine …, 2021 - proceedings.mlr.press
We introduce a new skill-discovery algorithm that builds a discrete graph representation of
large continuous MDPs, where nodes correspond to skill subgoals and the edges to skill …

Learning subgoal representations with slow dynamics

S Li, L Zheng, J Wang, C Zhang - International Conference on …, 2021 - openreview.net
In goal-conditioned Hierarchical Reinforcement Learning (HRL), a high-level policy
periodically sets subgoals for a low-level policy, and the low-level policy is trained to reach …

Deep Laplacian-based options for temporally-extended exploration

M Klissarov, MC Machado - arXiv preprint arXiv:2301.11181, 2023 - arxiv.org
Selecting exploratory actions that generate a rich stream of experience for better learning is
a fundamental challenge in reinforcement learning (RL). An approach to tackle this problem …

Exploration in reinforcement learning with deep covering options

Y Jinnai, JW Park, MC Machado… - … Conference on Learning …, 2020 - openreview.net
While many option discovery methods have been proposed to accelerate exploration in
reinforcement learning, they are often heuristic. Recently, covering options was proposed to …

Value preserving state-action abstractions

D Abel, N Umbanhowar, K Khetarpal… - International …, 2020 - proceedings.mlr.press
Abstraction can improve the sample efficiency of reinforcement learning. However, the
process of abstraction inherently discards information, potentially compromising an agent's …