Bigger, better, faster: Human-level Atari with human-level efficiency

M Schwarzer, JSO Ceron, A Courville… - International …, 2023 - proceedings.mlr.press
We introduce a value-based RL agent, which we call BBF, that achieves super-human
performance in the Atari 100K benchmark. BBF relies on scaling the neural networks used …

Loss of plasticity in deep continual learning

S Dohare, JF Hernandez-Garcia, Q Lan, P Rahman… - Nature, 2024 - nature.com
Artificial neural networks, deep-learning methods and the backpropagation algorithm form
the foundation of modern machine learning and artificial intelligence. These methods are …

Loss of plasticity in continual deep reinforcement learning

Z Abbas, R Zhao, J Modayil, A White… - … on Lifelong Learning …, 2023 - proceedings.mlr.press
In this paper, we characterize the behavior of canonical value-based deep reinforcement
learning (RL) approaches under varying degrees of non-stationarity. In particular, we …

PLASTIC: Improving input and label plasticity for sample efficient reinforcement learning

H Lee, H Cho, H Kim, D Gwak, J Kim… - Advances in …, 2024 - proceedings.neurips.cc
In Reinforcement Learning (RL), enhancing sample efficiency is crucial, particularly
in scenarios when data acquisition is costly and risky. In principle, off-policy RL algorithms …

Deep reinforcement learning with plasticity injection

E Nikishin, J Oh, G Ostrovski, C Lyle… - Advances in …, 2024 - proceedings.neurips.cc
A growing body of evidence suggests that neural networks employed in deep reinforcement
learning (RL) gradually lose their plasticity, the ability to learn from new data; however, the …

Maintaining plasticity in deep continual learning

S Dohare, JF Hernandez-Garcia, P Rahman… - arXiv preprint arXiv …, 2023 - arxiv.org
Modern deep-learning systems are specialized to problem settings in which training occurs
once and then never again, as opposed to continual-learning settings in which training …

Stop regressing: Training value functions via classification for scalable deep RL

J Farebrother, J Orbay, Q Vuong, AA Taïga… - arXiv preprint arXiv …, 2024 - arxiv.org
Value functions are a central component of deep reinforcement learning (RL). These
functions, parameterized by neural networks, are trained using a mean squared error …

Maintaining plasticity via regenerative regularization

S Kumar, H Marklund, B Van Roy - arXiv preprint arXiv:2308.11958, 2023 - arxiv.org
In continual learning, plasticity refers to the ability of an agent to quickly adapt to new
information. Neural networks are known to lose plasticity when processing non-stationary …

Overestimation, overfitting, and plasticity in actor-critic: the bitter lesson of reinforcement learning

M Nauman, M Bortkiewicz, P Miłoś, T Trzciński… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in off-policy Reinforcement Learning (RL) have significantly improved
sample efficiency, primarily due to the incorporation of various forms of regularization that …

DrM: Mastering visual reinforcement learning through dormant ratio minimization

G Xu, R Zheng, Y Liang, X Wang, Z Yuan, T Ji… - arXiv preprint arXiv …, 2023 - arxiv.org
Visual reinforcement learning (RL) has shown promise in continuous control tasks. Despite
its progress, current algorithms are still unsatisfactory in virtually every aspect of the …