A survey on deep learning for software engineering

Y Yang, X Xia, D Lo, J Grundy - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
In 2006, Geoffrey Hinton proposed the concept of training “Deep Neural Networks (DNNs)”
and an improved model training method to break the bottleneck of neural network …

Deep learning for source code modeling and generation: Models, applications, and challenges

THM Le, H Chen, MA Babar - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Deep Learning (DL) techniques for Natural Language Processing have been evolving
remarkably fast. Recently, the DL advances in language modeling, machine translation, and …

Evaluating large language models trained on code

M Chen, J Tworek, H Jun, Q Yuan, HPDO Pinto… - arXiv preprint arXiv …, 2021 - arxiv.org
We introduce Codex, a GPT language model fine-tuned on publicly available code from
GitHub, and study its Python code-writing capabilities. A distinct production version of Codex …

Graph neural networks: foundation, frontiers and applications

L Wu, P Cui, J Pei, L Zhao, X Guo - … of the 28th ACM SIGKDD Conference …, 2022 - dl.acm.org
The field of graph neural networks (GNNs) has seen rapid and incredible strides over the
recent years. Graph neural networks, also known as deep learning on graphs, graph …

Unified pre-training for program understanding and generation

WU Ahmad, S Chakraborty, B Ray… - arXiv preprint arXiv …, 2021 - arxiv.org
Code summarization and generation empower conversion between programming language
(PL) and natural language (NL), while code translation avails the migration of legacy code …

Graph neural networks for natural language processing: A survey

L Wu, Y Chen, K Shen, X Guo, H Gao… - … and Trends® in …, 2023 - nowpublishers.com
Deep learning has become the dominant approach in addressing various tasks in Natural
Language Processing (NLP). Although text inputs are typically represented as a sequence …

GraphCodeBERT: Pre-training code representations with data flow

D Guo, S Ren, S Lu, Z Feng, D Tang, S Liu… - arXiv preprint arXiv …, 2020 - arxiv.org
Pre-trained models for programming language have achieved dramatic empirical
improvements on a variety of code-related tasks such as code search, code completion …

Open graph benchmark: Datasets for machine learning on graphs

W Hu, M Fey, M Zitnik, Y Dong, H Ren… - Advances in neural …, 2020 - proceedings.neurips.cc
We present the Open Graph Benchmark (OGB), a diverse set of challenging and
realistic benchmark datasets to facilitate scalable, robust, and reproducible graph machine …

CURE: Code-aware neural machine translation for automatic program repair

N Jiang, T Lutellier, L Tan - 2021 IEEE/ACM 43rd International …, 2021 - ieeexplore.ieee.org
Automatic program repair (APR) is crucial to improve software reliability. Recently, neural
machine translation (NMT) techniques have been used to automatically fix software bugs …

CodeBERT: A pre-trained model for programming and natural languages

Z Feng, D Guo, D Tang, N Duan, X Feng… - arXiv preprint arXiv …, 2020 - arxiv.org
We present CodeBERT, a bimodal pre-trained model for programming language (PL) and
natural language (NL). CodeBERT learns general-purpose representations that support …