Large language monkeys: Scaling inference compute with repeated sampling

B Brown, J Juravsky, R Ehrlich, R Clark, QV Le… - arXiv preprint arXiv …, 2024 - arxiv.org
Scaling the amount of compute used to train language models has dramatically improved
their capabilities. However, when it comes to inference, we often limit the amount of compute …
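
The repeated-sampling recipe this abstract describes is simple enough to sketch: draw many independent samples per problem, count the problem as solved if any sample passes a checker, and report coverage with the standard unbiased pass@k estimator (Chen et al., 2021). The sketch below is illustrative, not the paper's code; `generate` and `verify` are hypothetical stand-ins for the model call and the answer checker.

```python
import math
from typing import Callable

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn from n total with c correct, solves the task."""
    if n - c < k:
        return 1.0  # every size-k subset contains a correct sample
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

def repeated_sampling(prompt: str,
                      generate: Callable[[str], str],
                      verify: Callable[[str, str], bool],
                      n: int = 100) -> float:
    # Draw n i.i.d. samples and estimate single-sample coverage.
    correct = sum(verify(prompt, generate(prompt)) for _ in range(n))
    return pass_at_k(n, correct, k=1)
```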

From LLMs to LLM-based agents for software engineering: A survey of current, challenges and future

H Jin, L Huang, H Cai, J Yan, B Li, H Chen - arXiv preprint arXiv …, 2024 - arxiv.org
With the rise of large language models (LLMs), researchers are increasingly exploring their
applications in various vertical domains, such as software engineering. LLMs have …

Multi-agent software development through cross-team collaboration

Z Du, C Qian, W Liu, Z Xie, Y Wang, Y Dang… - arXiv preprint arXiv …, 2024 - arxiv.org
The latest breakthroughs in Large Language Models (LLMs), e.g., ChatDev, have catalyzed
profound transformations, particularly through multi-agent collaboration for software …

Inference-aware fine-tuning for best-of-n sampling in large language models

Y Chow, G Tennenholtz, I Gur, V Zhuang, B Dai… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent studies have indicated that effectively utilizing inference-time compute is crucial for
attaining better performance from large language models (LLMs). In this work, we propose a …
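
Best-of-n itself is a one-liner once a scorer exists: sample n candidates and keep the highest-scoring one. A minimal sketch, assuming hypothetical `sample` and `score` callables (the scorer standing in for a reward or verifier model); the paper's actual contribution, fine-tuning the policy with this selection step in mind, is not shown.

```python
from typing import Callable, List

def best_of_n(prompt: str,
              sample: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 16) -> str:
    """Draw n candidate responses and return the one the scorer
    rates highest for this prompt."""
    candidates: List[str] = [sample(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))
```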

What did I do wrong? Quantifying LLMs' sensitivity and consistency to prompt engineering

F Errica, G Siracusano, D Sanvito, R Bifulco - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) changed the way we design and interact with software
systems. Their ability to process and extract information from text has drastically improved …
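
One simple way to quantify the sensitivity the abstract alludes to is pairwise agreement across paraphrases of the same prompt. The sketch below is an assumed illustrative metric, not necessarily the paper's; `answer` is a hypothetical model-call wrapper, and normalization is a bare `strip().lower()`.

```python
from itertools import combinations
from typing import Callable, List

def consistency(paraphrases: List[str],
                answer: Callable[[str], str]) -> float:
    """Fraction of paraphrase pairs yielding the same normalized
    answer; 1.0 means the model ignores the rewording entirely."""
    outputs = [answer(p).strip().lower() for p in paraphrases]
    pairs = list(combinations(outputs, 2))
    return sum(a == b for a, b in pairs) / len(pairs) if pairs else 1.0
```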

G-Designer: Architecting multi-agent communication topologies via graph neural networks

G Zhang, Y Yue, X Sun, G Wan, M Yu, J Fang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in large language model (LLM)-based agents have demonstrated
that collective intelligence can significantly surpass the capabilities of individual agents …
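
A communication topology of the kind this work designs can be represented as a directed graph over agents: each round, an agent reads only its in-neighbors' latest messages. A toy sketch under that assumption follows; `step` (the per-agent LLM call) is hypothetical, and G-Designer's actual GNN that constructs the graph per task is not shown.

```python
from typing import Callable, Dict, List

def run_round(agents: List[str],
              in_neighbors: Dict[str, List[str]],
              messages: Dict[str, str],
              step: Callable[[str, List[str]], str]) -> Dict[str, str]:
    """One round of message passing: every agent produces a new
    message after reading its in-neighbors' previous messages."""
    return {a: step(a, [messages[src] for src in in_neighbors.get(a, [])])
            for a in agents}
```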

PrefCLM: Enhancing preference-based reinforcement learning with crowdsourced large language models

R Wang, D Zhao, Z Yuan, I Obi… - IEEE Robotics and …, 2025 - ieeexplore.ieee.org
Preference-based reinforcement learning (PbRL) is emerging as a promising approach to
teaching robots through human comparative feedback without complex reward engineering …
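
PbRL typically fits a reward model to pairwise preferences with a Bradley-Terry objective; PrefCLM's twist, per the abstract, is sourcing those preference labels from a crowd of LLM evaluators rather than humans. A minimal sketch of the standard reward-learning loss, with the label source abstracted into `prefer_a`:

```python
import torch
import torch.nn.functional as F

def preference_loss(r_a: torch.Tensor,
                    r_b: torch.Tensor,
                    prefer_a: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry loss: the model predicts P(A preferred over B)
    as sigmoid(sum_of_rewards_A - sum_of_rewards_B). `r_a`/`r_b` are
    per-pair summed predicted rewards; `prefer_a` holds {0,1} labels
    (here: aggregated judgments from LLM evaluators)."""
    return F.binary_cross_entropy_with_logits(r_a - r_b, prefer_a.float())
```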

A perspective for adapting generalist AI to specialized medical AI applications and their challenges

Z Wang, H Wang, B Danek, Y Li, C Mack… - arXiv preprint arXiv …, 2024 - arxiv.org
The integration of Large Language Models (LLMs) into medical applications has sparked
widespread interest across the healthcare industry, from drug discovery and development to …

Scaling large-language-model-based multi-agent collaboration

C Qian, Z Xie, Y Wang, W Liu, Y Dang, Z Du… - arXiv preprint arXiv …, 2024 - arxiv.org
Pioneering advancements in large language model-powered agents have underscored the
design pattern of multi-agent collaboration, demonstrating that collective intelligence can …

Turn every application into an agent: Towards efficient human-agent-computer interaction with API-first LLM-based agents

J Lu, Z Zhang, F Yang, J Zhang, L Wang, C Du… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal large language models (MLLMs) have enabled LLM-based agents to directly
interact with application user interfaces (UIs), enhancing agents' performance in complex …