A survey on deep learning for software engineering

Y Yang, X Xia, D Lo, J Grundy - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
In 2006, Geoffrey Hinton proposed the concept of training “Deep Neural Networks (DNNs)”
and an improved model training method to break the bottleneck of neural network …

A survey of machine learning for big code and naturalness

M Allamanis, ET Barr, P Devanbu… - ACM Computing Surveys …, 2018 - dl.acm.org
Research at the intersection of machine learning, programming languages, and software
engineering has recently taken important steps in proposing learnable probabilistic models …

LEVER: Learning to verify language-to-code generation with execution

A Ni, S Iyer, D Radev, V Stoyanov… - International …, 2023 - proceedings.mlr.press
The advent of large language models trained on code (code LLMs) has led to significant
progress in language-to-code generation. State-of-the-art approaches in this area combine …

Self-planning code generation with large language models

X Jiang, Y Dong, L Wang, Z Fang, Q Shang… - ACM Transactions on …, 2024 - dl.acm.org
Although large language models (LLMs) have demonstrated impressive ability in code
generation, they are still struggling to address the complicated intent provided by humans. It …

AI-generated content (AIGC): A survey

J Wu, W Gan, Z Chen, S Wan, H Lin - arXiv preprint arXiv:2304.06632, 2023 - arxiv.org
To address the challenges of digital intelligence in the digital economy, artificial intelligence-
generated content (AIGC) has emerged. AIGC uses artificial intelligence to assist or replace …

GraphCodeBERT: Pre-training code representations with data flow

D Guo, S Ren, S Lu, Z Feng, D Tang, S Liu… - arXiv preprint arXiv …, 2020 - arxiv.org
Pre-trained models for programming language have achieved dramatic empirical
improvements on a variety of code-related tasks such as code search, code completion …

CodeBLEU: a method for automatic evaluation of code synthesis

S Ren, D Guo, S Lu, L Zhou, S Liu, D Tang… - arXiv preprint arXiv …, 2020 - arxiv.org
Evaluation metrics play a vital role in the growth of an area as it defines the standard of
distinguishing between good and bad models. In the area of code synthesis, the commonly …

A syntax-guided edit decoder for neural program repair

Q Zhu, Z Sun, Y Xiao, W Zhang, K Yuan… - Proceedings of the 29th …, 2021 - dl.acm.org
Automated Program Repair (APR) helps improve the efficiency of software development and
maintenance. Recent APR techniques use deep learning, particularly the encoder-decoder …

Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task

T Yu, R Zhang, K Yang, M Yasunaga, D Wang… - arXiv preprint arXiv …, 2018 - arxiv.org
We present Spider, a large-scale, complex and cross-domain semantic parsing and text-to-
SQL dataset annotated by 11 college students. It consists of 10,181 questions and 5,693 …

code2seq: Generating sequences from structured representations of code

U Alon, S Brody, O Levy, E Yahav - arXiv preprint arXiv:1808.01400, 2018 - arxiv.org
The ability to generate natural language sequences from source code snippets has a variety
of applications such as code summarization, documentation, and retrieval. Sequence-to …