Retrieval-augmented generation for AI-generated content: A survey

P Zhao, H Zhang, Q Yu, Z Wang, Y Geng, F Fu… - arXiv preprint arXiv …, 2024 - arxiv.org
The development of Artificial Intelligence Generated Content (AIGC) has been facilitated by
advancements in model algorithms, scalable foundation model architectures, and the …

Unifying the perspectives of NLP and software engineering: A survey on language models for code

Z Zhang, C Chen, B Liu, C Liao, Z Gong, H Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
In this work we systematically review the recent advancements in software engineering with
language models, covering 70+ models, 40+ evaluation tasks, 180+ datasets, and 900 …

CodeAgent: Enhancing code generation with tool-integrated agent systems for real-world repo-level coding challenges

K Zhang, J Li, G Li, X Shi, Z Jin - arXiv preprint arXiv:2401.07339, 2024 - arxiv.org
Large Language Models (LLMs) have shown promise in automated code generation but
typically excel only in simpler tasks such as generating standalone code units. Real-world …

RLCoder: Reinforcement learning for repository-level code completion

Y Wang, Y Wang, D Guo, J Chen, R Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Repository-level code completion aims to generate code for unfinished code snippets within
the context of a specified repository. Existing approaches mainly rely on retrieval-augmented …

Automatic programming: Large language models and beyond

MR Lyu, B Ray, A Roychoudhury, SH Tan… - ACM Transactions on …, 2024 - dl.acm.org
Automatic programming has seen increasing popularity due to the emergence of tools like
GitHub Copilot which rely on Large Language Models (LLMs). At the same time …

Iterative refinement of project-level code context for precise code generation with compiler feedback

Z Bi, Y Wan, Z Wang, H Zhang, B Guan, F Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have shown remarkable progress in automated code
generation. Yet, LLM-generated code may contain errors in API usage, class, data structure …

Agents in software engineering: Survey, landscape, and vision

Y Wang, W Zhong, Y Huang, E Shi, M Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, Large Language Models (LLMs) have achieved remarkable success and
have been widely used in various downstream tasks, especially in the tasks of the software …

GraphCoder: Enhancing repository-level code completion via code context graph-based retrieval and language model

W Liu, A Yu, D Zan, B Shen, W Zhang, H Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
The performance of repository-level code completion depends upon the effective leverage of
both general and repository-specific knowledge. Despite the impressive capability of code …

Revisiting code similarity evaluation with abstract syntax tree edit distance

Y Song, C Lothritz, D Tang, TF Bissyandé… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper revisits recent code similarity evaluation metrics, particularly focusing on the
application of Abstract Syntax Tree (AST) editing distance in diverse programming …

A solution toward transparent and practical AI regulation: Privacy nutrition labels for open-source generative AI-based applications

M Si, S Pan, D Liao, X Sun, Z Tao, W Shi… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid development and widespread adoption of Generative Artificial Intelligence-based
(GAI) applications have greatly enriched our daily lives, benefiting people by enhancing …