Bigger, better, faster: Human-level Atari with human-level efficiency
We introduce a value-based RL agent, which we call BBF, that achieves super-human
performance in the Atari 100K benchmark. BBF relies on scaling the neural networks used …
Loss of plasticity in deep continual learning
Artificial neural networks, deep-learning methods and the backpropagation algorithm form
the foundation of modern machine learning and artificial intelligence. These methods are …
Loss of plasticity in continual deep reinforcement learning
In this paper, we characterize the behavior of canonical value-based deep reinforcement
learning (RL) approaches under varying degrees of non-stationarity. In particular, we …
PLASTIC: Improving input and label plasticity for sample efficient reinforcement learning
In Reinforcement Learning (RL), enhancing sample efficiency is crucial, particularly
in scenarios when data acquisition is costly and risky. In principle, off-policy RL algorithms …
Deep reinforcement learning with plasticity injection
A growing body of evidence suggests that neural networks employed in deep reinforcement
learning (RL) gradually lose their plasticity, the ability to learn from new data; however, the …
Maintaining plasticity in deep continual learning
Modern deep-learning systems are specialized to problem settings in which training occurs
once and then never again, as opposed to continual-learning settings in which training …
Stop regressing: Training value functions via classification for scalable deep RL
Value functions are a central component of deep reinforcement learning (RL). These
functions, parameterized by neural networks, are trained using a mean squared error …
Maintaining plasticity via regenerative regularization
In continual learning, plasticity refers to the ability of an agent to quickly adapt to new
information. Neural networks are known to lose plasticity when processing non-stationary …
Overestimation, overfitting, and plasticity in actor-critic: the bitter lesson of reinforcement learning
Recent advancements in off-policy Reinforcement Learning (RL) have significantly improved
sample efficiency, primarily due to the incorporation of various forms of regularization that …
DrM: Mastering visual reinforcement learning through dormant ratio minimization
Visual reinforcement learning (RL) has shown promise in continuous control tasks. Despite
its progress, current algorithms are still unsatisfactory in virtually every aspect of the …