Code generation using machine learning: A systematic review

E Dehaerne, B Dey, S Halder, S De Gendt… - IEEE …, 2022 - ieeexplore.ieee.org
Recently, machine learning (ML) methods have been used to create powerful language
models for a broad range of natural language processing tasks. An important subset of this …

Code search: A survey of techniques for finding code

L Di Grazia, M Pradel - ACM Computing Surveys, 2023 - dl.acm.org
The immense amounts of source code provide ample challenges and opportunities during
software development. To handle the size of code bases, developers commonly search for …

CodeXGLUE: A machine learning benchmark dataset for code understanding and generation

S Lu, D Guo, S Ren, J Huang, A Svyatkovskiy… - arXiv preprint arXiv …, 2021 - arxiv.org
Benchmark datasets have a significant impact on accelerating research in programming
language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster …

Unifying the perspectives of NLP and software engineering: A survey on language models for code

Z Zhang, C Chen, B Liu, C Liao, Z Gong, H Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
In this work we systematically review the recent advancements in software engineering with
language models, covering 70+ models, 40+ evaluation tasks, 180+ datasets, and 900 …

On the importance of building high-quality training datasets for neural code search

Z Sun, L Li, Y Liu, X Du, L Li - … of the 44th International Conference on …, 2022 - dl.acm.org
The performance of neural code search is significantly influenced by the quality of the
training data from which the neural models are derived. A large corpus of high-quality query …

AST-trans: Code summarization with efficient tree-structured attention

Z Tang, X Shen, C Li, J Ge, L Huang, Z Zhu… - Proceedings of the 44th …, 2022 - dl.acm.org
Code summarization aims to generate brief natural language descriptions for source codes.
The state-of-the-art approaches follow a transformer-based encoder-decoder architecture …

Deep learning-based software engineering: progress, challenges, and opportunities

X Chen, X Hu, Y Huang, H Jiang, W Ji, Y Jiang… - Science China …, 2025 - Springer
Researchers have recently achieved significant advances in deep learning techniques,
which in turn has substantially advanced other research disciplines, such as natural …

PyMT5: multi-mode translation of natural language and Python code with transformers

CB Clement, D Drain, J Timcheck… - arXiv preprint arXiv …, 2020 - arxiv.org
Simultaneously modeling source code and natural language has many exciting applications
in automated software development and understanding. Pursuant to achieving such …

Survey of code search based on deep learning

Y Xie, J Lin, H Dong, L Zhang, Z Wu - ACM Transactions on Software …, 2023 - dl.acm.org
Code writing is repetitive and predictable, inspiring us to develop various code intelligence
techniques. This survey focuses on code search, that is, to retrieve code that matches a …

Code to comment "translation": data, metrics, baselining & evaluation

D Gros, H Sezhiyan, P Devanbu, Z Yu - Proceedings of the 35th IEEE …, 2020 - dl.acm.org
The relationship of comments to code, and in particular, the task of generating useful
comments given the code, has long been of interest. The earliest approaches have been …