The rise and potential of large language model based agents: A survey

Z Xi, W Chen, X Guo, W He, Y Ding, B Hong… - Science China …, 2025 - Springer
For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …

Self-refine: Iterative refinement with self-feedback

A Madaan, N Tandon, P Gupta… - Advances in …, 2023 - proceedings.neurips.cc
Like humans, large language models (LLMs) do not always generate the best output on their
first try. Motivated by how humans refine their written text, we introduce Self-Refine, an …

OpenCodeInterpreter: Integrating code generation with execution and refinement

T Zheng, G Zhang, T Shen, X Liu, BY Lin, J Fu… - arXiv preprint arXiv …, 2024 - arxiv.org
The introduction of large language models has significantly advanced code generation.
However, open-source models often lack the execution capabilities and iterative refinement …

REFINER: Reasoning feedback on intermediate representations

D Paul, M Ismayilzada, M Peyrard, B Borges… - arXiv preprint arXiv …, 2023 - arxiv.org
Language models (LMs) have recently shown remarkable performance on reasoning tasks
by explicitly generating intermediate inferences, e.g., chain-of-thought prompting. However …

Bridging the gap: A survey on integrating (human) feedback for natural language generation

P Fernandes, A Madaan, E Liu, A Farinhas… - Transactions of the …, 2023 - direct.mit.edu
Natural language generation has witnessed significant advancements due to the training of
large language models on vast internet-scale datasets. Despite these advancements, there …

RL4F: Generating natural language feedback with reinforcement learning for repairing model outputs

AF Akyürek, E Akyürek, A Madaan, A Kalyan… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite their unprecedented success, even the largest language models make mistakes.
Similar to how humans learn and improve using feedback, previous work proposed …

Memory-assisted prompt editing to improve GPT-3 after deployment

A Madaan, N Tandon, P Clark, Y Yang - arXiv preprint arXiv:2201.06009, 2022 - arxiv.org
Large LMs such as GPT-3 are powerful, but can commit mistakes that are obvious to
humans. For example, GPT-3 would mistakenly interpret "What word is similar to good?" to …

Could ChatGPT get an engineering degree? Evaluating higher education vulnerability to AI assistants

B Borges, N Foroutan, D Bayazit, A Sotnikova… - Proceedings of the …, 2024 - pnas.org
AI assistants, such as ChatGPT, are being increasingly used by students in higher education
institutions. While these tools provide opportunities for improved teaching and education …

Editing common sense in transformers

A Gupta, D Mondal, AK Sheshadri, W Zhao… - arXiv preprint arXiv …, 2023 - arxiv.org
Editing model parameters directly in Transformers makes updating open-source transformer-
based models possible without re-training (Meng et al., 2023). However, these editing …

Improving grounded language understanding in a collaborative environment by interacting with agents through help feedback

N Mehta, M Teruel, PF Sanz, X Deng… - arXiv preprint arXiv …, 2023 - arxiv.org
Many approaches to Natural Language Processing (NLP) tasks often treat them as single-
step problems, where an agent receives an instruction, executes it, and is evaluated based …