Reflexion: Language agents with verbal reinforcement learning
Large language models (LLMs) have been increasingly used to interact with external
environments (e.g., games, compilers, APIs) as goal-driven agents. However, it remains …
A survey of zero-shot generalisation in deep reinforcement learning
The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …
Do the rewards justify the means? Measuring trade-offs between rewards and ethical behavior in the MACHIAVELLI benchmark
Artificial agents have traditionally been trained to maximize reward, which may incentivize
power-seeking and deception, analogous to how next-token prediction in language models …
AgentBench: Evaluating LLMs as agents
Large Language Models (LLMs) are becoming increasingly smart and autonomous,
targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has …
Grounding large language models in interactive environments with online reinforcement learning
Recent works successfully leveraged Large Language Models' (LLM) abilities to capture
abstract knowledge about world's physics to solve decision-making problems. Yet, the …