The rise and potential of large language model based agents: A survey
For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …
AI deception: A survey of examples, risks, and potential solutions
This paper argues that a range of current AI systems have learned how to deceive humans.
We define deception as the systematic inducement of false beliefs in the pursuit of some …
Augmented language models: a survey
This survey reviews works in which language models (LMs) are augmented with reasoning
skills and the ability to use tools. The former is defined as decomposing a potentially …
MetaGPT: Meta programming for multi-agent collaborative framework
Recently, remarkable progress has been made in automated task-solving through the use of
multi-agent driven by large language models (LLMs). However, existing LLM-based multi …
Do the rewards justify the means? Measuring trade-offs between rewards and ethical behavior in the MACHIAVELLI benchmark
Artificial agents have traditionally been trained to maximize reward, which may incentivize
power-seeking and deception, analogous to how next-token prediction in language models …
Discovering latent knowledge in language models without supervision
Existing techniques for training language models can be misaligned with the truth: if we train
models with imitation learning, they may reproduce errors that humans make; if we train …
An overview of catastrophic AI risks
Rapid advancements in artificial intelligence (AI) have sparked growing concerns among
experts, policymakers, and world leaders regarding the potential for increasingly advanced …
Can we edit factual knowledge by in-context learning?
Previous studies have shown that large language models (LLMs) like GPTs store massive
factual knowledge in their parameters. However, the stored knowledge could be false or out …
AI alignment: A comprehensive survey
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …
The alignment problem from a deep learning perspective
In coming decades, artificial general intelligence (AGI) may surpass human capabilities at
many critical tasks. We argue that, without substantial effort to prevent it, AGIs could learn to …