Tool learning with foundation models

Y Qin, S Hu, Y Lin, W Chen, N Ding, G Cui… - ACM Computing …, 2024 - dl.acm.org
Humans possess an extraordinary ability to create and utilize tools. With the advent of
foundation models, artificial intelligence systems have the potential to be equally adept in …

RepoAgent: An LLM-powered open-source framework for repository-level code documentation generation

Q Luo, Y Ye, S Liang, Z Zhang, Y Qin, Y Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
Generative models have demonstrated considerable potential in software engineering,
particularly in tasks such as code generation and debugging. However, their utilization in the …

KnowAgent: Knowledge-augmented planning for LLM-based agents

Y Zhu, S Qiao, Y Ou, S Deng, N Zhang, S Lyu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated great potential in complex reasoning
tasks, yet they fall short when tackling more sophisticated challenges, especially when …

Automating the enterprise with foundation models

M Wornow, A Narayan, K Opsahl-Ong… - Proceedings of the …, 2024 - dl.acm.org
Automating enterprise workflows could unlock $4 trillion/year in productivity gains. Despite
being of interest to the data management community for decades, the ultimate vision of end …

ScreenAgent: A vision language model-driven computer control agent

R Niu, J Li, S Wang, Y Fu, X Hu, X Leng, H Kong… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing Large Language Models (LLM) can invoke a variety of tools and APIs to complete
complex tasks. The computer, as the most powerful and universal tool, could potentially be …

Agent planning with world knowledge model

S Qiao, R Fang, N Zhang, Y Zhu… - Advances in …, 2025 - proceedings.neurips.cc
Recent endeavors towards directly using large language models (LLMs) as agent models to
execute interactive planning tasks have shown commendable results. Despite their …

Large language model-brained GUI agents: A survey

C Zhang, S He, J Qian, B Li, L Li, S Qin, Y Kang… - arXiv preprint arXiv …, 2024 - arxiv.org
GUIs have long been central to human-computer interaction, providing an intuitive and
visually-driven way to access and interact with digital systems. The advent of LLMs …

ProcessCarbonAgent: A large language models-empowered autonomous agent for decision-making in manufacturing carbon emission management

T Wu, J Li, J Bao, Q Liu - Journal of Manufacturing Systems, 2024 - Elsevier
Knowledge-intensive production represents a primary trend in industrial
manufacturing, which heavily relies on the production logs of large-scale, historically similar …

Towards responsible generative AI: A reference architecture for designing foundation model based agents

Q Lu, L Zhu, X Xu, Z Xing, S Harrer… - 2024 IEEE 21st …, 2024 - ieeexplore.ieee.org
Foundation models, such as large language models (LLMs), have been widely recognised
as transformative AI technologies due to their capabilities to understand and generate …

FlowBench: Revisiting and benchmarking workflow-guided planning for LLM-based agents

R Xiao, W Ma, K Wang, Y Wu, J Zhao, H Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
LLM-based agents have emerged as promising tools, which are crafted to fulfill complex
tasks by iterative planning and action. However, these agents are susceptible to undesired …