Stop regressing: Training value functions via classification for scalable deep RL

J Farebrother, J Orbay, Q Vuong, AA Taïga… - arXiv preprint arXiv …, 2024 - arxiv.org
Value functions are a central component of deep reinforcement learning (RL). These
functions, parameterized by neural networks, are trained using a mean squared error …

Predictive representations: Building blocks of intelligence

W Carvalho, MS Tomov, W de Cothi, C Barry… - Neural …, 2024 - direct.mit.edu
Adaptive behavior often requires predicting future events. The theory of reinforcement
learning prescribes what kinds of predictive representations are useful and how to compute …

Reinforcement learning: An overview

K Murphy - arXiv preprint arXiv:2412.05265, 2024 - arxiv.org
This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement
learning and sequential decision making, covering value-based RL, policy-gradient …

A distributional analogue to the successor representation

H Wiltzer, J Farebrother, A Gretton, Y Tang… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper contributes a new approach for distributional reinforcement learning which
elucidates a clean separation of transition structure and reward in the learning process …

Learning Successor Features the Simple Way

R Chua, A Ghosh, C Kaplanis, BA Richards… - arXiv preprint arXiv …, 2024 - arxiv.org
In Deep Reinforcement Learning (RL), it is a challenge to learn representations that do not
exhibit catastrophic forgetting or interference in non-stationary environments. Successor …

Incorporating human plausibility in single- and multi-agent AI systems

SA Barnett - 2024 - search.proquest.com
As AI systems play a progressively larger role in human affairs, it becomes more important
that these systems are built with insights from human behavior. In particular, models that are …