Reflexion: Language agents with verbal reinforcement learning

N Shinn, F Cassano, A Gopinath… - Advances in …, 2024 - proceedings.neurips.cc
Large language models (LLMs) have been increasingly used to interact with external
environments (e.g., games, compilers, APIs) as goal-driven agents. However, it remains …

A survey of zero-shot generalisation in deep reinforcement learning

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org
The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

Do the rewards justify the means? Measuring trade-offs between rewards and ethical behavior in the MACHIAVELLI benchmark

A Pan, JS Chan, A Zou, N Li, S Basart… - International …, 2023 - proceedings.mlr.press
Artificial agents have traditionally been trained to maximize reward, which may incentivize
power-seeking and deception, analogous to how next-token prediction in language models …

AgentBench: Evaluating LLMs as agents

X Liu, H Yu, H Zhang, Y Xu, X Lei, H Lai, Y Gu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) are becoming increasingly smart and autonomous,
targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has …

Grounding large language models in interactive environments with online reinforcement learning

T Carta, C Romac, T Wolf, S Lamprier… - International …, 2023 - proceedings.mlr.press
Recent works successfully leveraged Large Language Models' (LLM) abilities to capture
abstract knowledge about world's physics to solve decision-making problems. Yet, the …