The rise and potential of large language model based agents: A survey

Z **, W Chen, X Guo, W He, Y Ding, B Hong… - Science China …, 2025 - Springer
For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …

Towards continual reinforcement learning: A review and perspectives

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org
In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

Leveraging procedural generation to benchmark reinforcement learning

K Cobbe, C Hesse, J Hilton… - … conference on machine …, 2020 - proceedings.mlr.press
Abstract We introduce Procgen Benchmark, a suite of 16 procedurally generated game-like
environments designed to benchmark both sample efficiency and generalization in …

Unity: A general platform for intelligent agents

A Juliani, VP Berges, E Teng, A Cohen… - arxiv preprint arxiv …, 2018 - arxiv.org
Recent advances in artificial intelligence have been driven by the presence of increasingly
realistic and complex simulated environments. However, many of the existing environments …

[PDF][PDF] On the measure of intelligence

F Chollet - arxiv preprint arxiv:1911.01547, 2019 - juanmirod.github.io
To make deliberate progress towards more intelligent and more human-like artificial
systems, we need to be following an appropriate feedback signal: we need to be able to …

Quantifying generalization in reinforcement learning

K Cobbe, O Klimov, C Hesse, T Kim… - … on machine learning, 2019 - proceedings.mlr.press
In this paper, we investigate the problem of overfitting in deep reinforcement learning.
Among the most common benchmarks in RL, it is customary to use the same environments …

Maximum entropy RL (provably) solves some robust RL problems

B Eysenbach, S Levine - arxiv preprint arxiv:2103.06257, 2021 - arxiv.org
Many potential applications of reinforcement learning (RL) require guarantees that the agent
will perform well in the face of disturbances to the dynamics or reward function. In this paper …

Contrastive behavioral similarity embeddings for generalization in reinforcement learning

R Agarwal, MC Machado, PS Castro… - arxiv preprint arxiv …, 2021 - arxiv.org
Reinforcement learning methods trained on few environments rarely learn policies that
generalize to unseen environments. To improve generalization, we incorporate the inherent …

Recurrent model-free rl can be a strong baseline for many pomdps

T Ni, B Eysenbach, R Salakhutdinov - arxiv preprint arxiv:2110.05038, 2021 - arxiv.org
Many problems in RL, such as meta-RL, robust RL, generalization in RL, and temporal credit
assignment, can be cast as POMDPs. In theory, simply augmenting model-free RL with …

The nethack learning environment

H Küttler, N Nardelli, A Miller… - Advances in …, 2020 - proceedings.neurips.cc
Abstract Progress in Reinforcement Learning (RL) algorithms goes hand-in-hand with the
development of challenging environments that test the limits of current methods. While …