A survey on deep learning for software engineering

Y Yang, X Xia, D Lo, J Grundy - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
In 2006, Geoffrey Hinton proposed the concept of training “Deep Neural Networks (DNNs)”
and an improved model training method to break the bottleneck of neural network …

Natural language generation and understanding of big code for AI-assisted programming: A review

MF Wong, S Guo, CN Hang, SW Ho, CW Tan - Entropy, 2023 - mdpi.com
This paper provides a comprehensive review of the literature concerning the utilization of
Natural Language Processing (NLP) techniques, with a particular focus on transformer …

Unified pre-training for program understanding and generation

WU Ahmad, S Chakraborty, B Ray… - arXiv preprint arXiv …, 2021 - arxiv.org
Code summarization and generation empower conversion between programming language
(PL) and natural language (NL), while code translation avails the migration of legacy code …

CodeXGLUE: A machine learning benchmark dataset for code understanding and generation

S Lu, D Guo, S Ren, J Huang, A Svyatkovskiy… - arXiv preprint arXiv …, 2021 - arxiv.org
Benchmark datasets have a significant impact on accelerating research in programming
language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster …

Data quality matters: A case study on data label correctness for security bug report prediction

X Wu, W Zheng, X Xia, D Lo - IEEE Transactions on Software …, 2021 - ieeexplore.ieee.org
In the research of mining software repositories, we need to label a large amount of data to
construct a predictive model. The correctness of the labels will affect the performance of a …

Bridging pre-trained models and downstream tasks for source code understanding

D Wang, Z Jia, S Li, Y Yu, Y Xiong, W Dong… - Proceedings of the 44th …, 2022 - dl.acm.org
With the great success of pre-trained models, the pretrain-then-finetune paradigm has been
widely adopted on downstream tasks for source code understanding. However, compared to …

InferCode: Self-supervised learning of code representations by predicting subtrees

NDQ Bui, Y Yu, L Jiang - 2021 IEEE/ACM 43rd International …, 2021 - ieeexplore.ieee.org
Learning code representations has found many uses in software engineering, such as code
classification, code search, comment generation, and bug prediction, etc. Although …

A systematic literature review on the use of deep learning in software engineering research

C Watson, N Cooper, DN Palacio, K Moran… - ACM Transactions on …, 2022 - dl.acm.org
An increasingly popular set of techniques adopted by software engineering (SE)
researchers to automate development tasks are those rooted in the concept of Deep …

Improved automatic summarization of subroutines via attention to file context

S Haque, A LeClair, L Wu, C McMillan - Proceedings of the 17th …, 2020 - dl.acm.org
Software documentation largely consists of short, natural language summaries of the
subroutines in the software. These summaries help programmers quickly understand what a …

Deep learning-based software engineering: progress, challenges, and opportunities

X Chen, X Hu, Y Huang, H Jiang, W Ji, Y Jiang… - Science China …, 2025 - Springer
Researchers have recently achieved significant advances in deep learning techniques,
which in turn has substantially advanced other research disciplines, such as natural …