Motif: Intrinsic motivation from artificial intelligence feedback

M Klissarov, P D'Oro, S Sodhani, R Raileanu… - arXiv preprint arXiv …, 2023 - arxiv.org
Exploring rich environments and evaluating one's actions without prior knowledge is
immensely challenging. In this paper, we propose Motif, a general method to interface such …

A survey of temporal credit assignment in deep reinforcement learning

E Pignatelli, J Ferret, M Geist, T Mesnard… - arXiv preprint arXiv …, 2023 - arxiv.org
The Credit Assignment Problem (CAP) refers to the longstanding challenge of
Reinforcement Learning (RL) agents to associate actions with their long-term …

Discerning temporal difference learning

J Ma - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Temporal difference learning (TD) is a foundational concept in reinforcement learning (RL),
aimed at efficiently assessing a policy's value function. TD(λ), a potent variant, incorporates …