Intelligent problem-solving as integrated hierarchical reinforcement learning

M Eppe, C Gumbsch, M Kerzel, PDH Nguyen… - Nature Machine …, 2022 - nature.com
According to cognitive psychology and related disciplines, the development of complex
problem-solving behaviour in biological agents depends on hierarchical cognitive …

Learning by playing: Solving sparse reward tasks from scratch

M Riedmiller, R Hafner, T Lampe… - International …, 2018 - proceedings.mlr.press
We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm in the
context of Reinforcement Learning (RL). SAC-X enables learning of complex behaviors - from …

Universal value function approximators

T Schaul, D Horgan, K Gregor… - … conference on machine …, 2015 - proceedings.mlr.press
Value functions are a core component of reinforcement learning. The main idea is to
construct a single function approximator V(s; θ) that estimates the long-term reward from …

The predictron: End-to-end learning and planning

D Silver, H Hasselt, M Hessel… - International …, 2017 - proceedings.mlr.press
One of the key challenges of artificial intelligence is to learn models that are effective in the
context of planning. In this document we introduce the predictron architecture. The …

Developing a predictive approach to knowledge

A White - 2015 - era.library.ualberta.ca
Understanding how an artificial agent may represent, acquire, update, and use large
amounts of knowledge has long been an important research challenge in artificial …

Importance resampling for off-policy prediction

M Schlegel, W Chung, D Graves… - Advances in Neural …, 2019 - proceedings.neurips.cc
Importance sampling (IS) is a common reweighting strategy for off-policy prediction in
reinforcement learning. While it is consistent and unbiased, it can result in high variance …

MHER: Model-based hindsight experience replay

R Yang, M Fang, L Han, Y Du, F Luo, X Li - arXiv preprint arXiv …, 2021 - arxiv.org
Solving multi-goal reinforcement learning (RL) problems with sparse rewards is generally
challenging. Existing approaches have utilized goal relabeling on collected experiences to …

General value function networks

M Schlegel, A Jacobsen, Z Abbas, A Patterson… - Journal of Artificial …, 2021 - jair.org
State construction is important for learning in partially observable environments. A general
purpose strategy for state construction is to learn the state update using a Recurrent Neural …

Effectively learning initiation sets in hierarchical reinforcement learning

A Bagaria, B Abbatematteo… - Advances in …, 2023 - proceedings.neurips.cc
An agent learning an option in hierarchical reinforcement learning must solve three
problems: identify the option's subgoal (termination condition), learn a policy, and learn …

Hierarchical principles of embodied reinforcement learning: A review

M Eppe, C Gumbsch, M Kerzel, PDH Nguyen… - arXiv preprint arXiv …, 2020 - arxiv.org
Cognitive Psychology and related disciplines have identified several critical mechanisms
that enable intelligent biological agents to learn to solve complex problems. There exists …