Exploring data aggregation in policy learning for vision-based urban autonomous driving
Data aggregation techniques can significantly improve vision-based policy learning within a
training environment, e.g., learning to drive in a specific simulation condition. However, as on …
Continual reinforcement learning with complex synapses
Unlike humans, who are capable of continual learning over their lifetimes, artificial neural
networks have long been known to suffer from a phenomenon known as catastrophic …
Policy consolidation for continual reinforcement learning
We propose a method for tackling catastrophic forgetting in deep reinforcement learning that
is agnostic to the timescale of changes in the distribution of experiences, does not …
Symbolic regression methods for reinforcement learning
Reinforcement learning algorithms can solve dynamic decision-making and optimal control
problems. With continuous-valued state and input variables, reinforcement learning …
Continual reinforcement learning with multi-timescale replay
In this paper, we propose a multi-timescale replay (MTR) buffer for improving continual
learning in RL agents faced with environments that are changing continuously over time at …
A framework of dual replay buffer: balancing forgetting and generalization in reinforcement learning
Experience replay buffer improves sample efficiency and training stabilization for recent
deep reinforcement learning (DRL) methods. However, for the first-in-first-out (FIFO) …
deep reinforcement learning (DRL) methods. However, for the first-in-first-out (FIFO) …
Optimal control via reinforcement learning with symbolic policy approximation
Model-based reinforcement learning (RL) algorithms can be used to derive optimal
control laws for nonlinear dynamic systems. With continuous-valued state and input …
Double Replay Buffers with Restricted Gradient
L Zhang, Z Zhang - International Conference on Neural Information …, 2020 - Springer
In this paper we consider the problem of how to balance exploration and exploitation in
deep reinforcement learning (DRL). We propose a generative method called double replay …
Memory-Efficient Reinforcement Learning with Priority based on Surprise and On-policyness
R Unno, Y Tsuruoka - openreview.net
In off-policy reinforcement learning, an agent collects transition data (aka experience tuples)
from the environment and stores them in a replay buffer for the incoming parameter updates …