Agentless: Demystifying LLM-based software engineering agents

CS Xia, Y Deng, S Dunn, L Zhang - arXiv preprint arXiv:2407.01489, 2024 - arxiv.org
Recent advancements in large language models (LLMs) have significantly advanced the
automation of software development tasks, including code synthesis, program repair, and …

Large language model supply chain: A research agenda

S Wang, Y Zhao, X Hou, H Wang - ACM Transactions on Software …, 2024 - dl.acm.org
The rapid advancement of large language models (LLMs) has revolutionized artificial
intelligence, introducing unprecedented capabilities in natural language processing and …

Agent-as-a-judge: Evaluate agents with agents

M Zhuge, C Zhao, D Ashley, W Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Contemporary evaluation techniques are inadequate for agentic systems. These
approaches either focus exclusively on final outcomes--ignoring the step-by-step nature of …

Security of Language Models for Code: A Systematic Literature Review

Y Chen, W Sun, C Fang, Z Chen, Y Ge, T Han… - arXiv preprint arXiv …, 2024 - arxiv.org
Language models for code (CodeLMs) have emerged as powerful tools for code-related
tasks, outperforming traditional methods and standard machine learning approaches …

Large language model-brained GUI agents: A survey

C Zhang, S He, J Qian, B Li, L Li, S Qin, Y Kang… - arXiv preprint arXiv …, 2024 - arxiv.org
GUIs have long been central to human-computer interaction, providing an intuitive and
visually-driven way to access and interact with digital systems. The advent of LLMs …

SelfCodeAlign: Self-alignment for code generation

Y Wei, F Cassano, J Liu, Y Ding, N Jain… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning is a supervised fine-tuning approach that significantly improves the ability
of large language models (LLMs) to follow human instructions. We propose SelfCodeAlign …

Do advanced language models eliminate the need for prompt engineering in software engineering?

G Wang, Z Sun, Z Gong, S Ye, Y Chen, Y Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have significantly advanced software engineering (SE)
tasks, with prompt engineering techniques enhancing their performance in code-related …

Towards trustworthy LLMs for code: A data-centric synergistic auditing framework

C Wang, Z Chen, T Li, Y Zhao, Y Liu - arXiv preprint arXiv:2410.09048, 2024 - arxiv.org
LLM-powered coding and development assistants have become prevalent to programmers'
workflows. However, concerns about the trustworthiness of LLMs for code persist despite …

Language models for code optimization: Survey, challenges and future directions

J Gong, V Voskanyan, P Brookes, F Wu, W Jie… - arXiv preprint arXiv …, 2025 - arxiv.org
Language models (LMs) built upon deep neural networks (DNNs) have recently
demonstrated breakthrough effectiveness in software engineering tasks like code …

From text to test: AI-generated control software for materials science instruments

D Fébba, K Egbo, WA Callahan, A Zakutayev - Digital Discovery, 2025 - pubs.rsc.org
Large language models (LLMs) are one of the AI technologies that are transforming the
landscape of chemistry and materials science. Recent examples of LLM-accelerated …