Exploring data aggregation in policy learning for vision-based urban autonomous driving

A Prakash, A Behl, E Ohn-Bar… - Proceedings of the …, 2020 - openaccess.thecvf.com
Data aggregation techniques can significantly improve vision-based policy learning within a
training environment, e.g., learning to drive in a specific simulation condition. However, as on …

Continual reinforcement learning with complex synapses

C Kaplanis, M Shanahan… - … Conference on Machine …, 2018 - proceedings.mlr.press
Unlike humans, who are capable of continual learning over their lifetimes, artificial neural
networks have long been known to suffer from a phenomenon known as catastrophic …

Policy consolidation for continual reinforcement learning

C Kaplanis, M Shanahan, C Clopath - arXiv preprint arXiv:1902.00255, 2019 - arxiv.org
We propose a method for tackling catastrophic forgetting in deep reinforcement learning that
is agnostic to the timescale of changes in the distribution of experiences, does not …

Symbolic regression methods for reinforcement learning

J Kubalík, E Derner, J Žegklitz, R Babuška - IEEE Access, 2021 - ieeexplore.ieee.org
Reinforcement learning algorithms can solve dynamic decision-making and optimal control
problems. With continuous-valued state and input variables, reinforcement learning …

Continual reinforcement learning with multi-timescale replay

C Kaplanis, C Clopath, M Shanahan - arXiv preprint arXiv:2004.07530, 2020 - arxiv.org
In this paper, we propose a multi-timescale replay (MTR) buffer for improving continual
learning in RL agents faced with environments that are changing continuously over time at …

A framework of dual replay buffer: balancing forgetting and generalization in reinforcement learning

L Zhang, Z Zhang, Z Pan, Y Chen, J Zhu… - Proceedings of the 2nd …, 2019 - surl.tirl.info
Experience replay buffer improves sample efficiency and training stabilization for recent
deep reinforcement learning (DRL) methods. However, for the first-in-first-out (FIFO) …

Optimal control via reinforcement learning with symbolic policy approximation

J Kubalík, E Alibekov, R Babuška - IFAC-PapersOnLine, 2017 - Elsevier
Model-based reinforcement learning (RL) algorithms can be used to derive optimal
control laws for nonlinear dynamic systems. With continuous-valued state and input …

Double Replay Buffers with Restricted Gradient

L Zhang, Z Zhang - International Conference on Neural Information …, 2020 - Springer
In this paper we consider the problem of how to balance exploration and exploitation in
deep reinforcement learning (DRL). We propose a generative method called double replay …

Memory-Efficient Reinforcement Learning with Priority based on Surprise and On-policyness

R Unno, Y Tsuruoka - openreview.net
In off-policy reinforcement learning, an agent collects transition data (a.k.a. experience tuples)
from the environment and stores them in a replay buffer for the incoming parameter updates …