A survey of machine learning for big code and naturalness

M Allamanis, ET Barr, P Devanbu… - ACM Computing Surveys …, 2018 - dl.acm.org
Research at the intersection of machine learning, programming languages, and software
engineering has recently taken important steps in proposing learnable probabilistic models …

Deep learning for source code modeling and generation: Models, applications, and challenges

THM Le, H Chen, MA Babar - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Deep Learning (DL) techniques for Natural Language Processing have been evolving
remarkably fast. Recently, the DL advances in language modeling, machine translation, and …

Graph neural networks: foundation, frontiers and applications

L Wu, P Cui, J Pei, L Zhao, X Guo - … of the 28th ACM SIGKDD Conference …, 2022 - dl.acm.org
The field of graph neural networks (GNNs) has seen rapid and incredible strides over the
recent years. Graph neural networks, also known as deep learning on graphs, graph …

CodeBLEU: a method for automatic evaluation of code synthesis

S Ren, D Guo, S Lu, L Zhou, S Liu, D Tang… - arXiv preprint arXiv …, 2020 - arxiv.org
Evaluation metrics play a vital role in the growth of an area as it defines the standard of
distinguishing between good and bad models. In the area of code synthesis, the commonly …

Studying the usage of text-to-text transfer transformer to support code-related tasks

A Mastropaolo, S Scalabrino, N Cooper… - 2021 IEEE/ACM …, 2021 - ieeexplore.ieee.org
Deep learning (DL) techniques are gaining more and more attention in the software
engineering community. They have been used to support several code-related tasks, such …

Learning and evaluating contextual embedding of source code

A Kanade, P Maniatis… - … on machine learning, 2020 - proceedings.mlr.press
Recent research has achieved impressive results on understanding and improving source
code by building up on machine-learning techniques developed for natural languages. A …

Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task

T Yu, R Zhang, K Yang, M Yasunaga, D Wang… - arXiv preprint arXiv …, 2018 - arxiv.org
We present Spider, a large-scale, complex and cross-domain semantic parsing and text-to-
SQL dataset annotated by 11 college students. It consists of 10,181 questions and 5,693 …

code2seq: Generating sequences from structured representations of code

U Alon, S Brody, O Levy, E Yahav - arXiv preprint arXiv:1808.01400, 2018 - arxiv.org
The ability to generate natural language sequences from source code snippets has a variety
of applications such as code summarization, documentation, and retrieval. Sequence-to …