A survey of machine learning for big code and naturalness
Research at the intersection of machine learning, programming languages, and software
engineering has recently taken important steps in proposing learnable probabilistic models …
Deep learning for source code modeling and generation: Models, applications, and challenges
Deep Learning (DL) techniques for Natural Language Processing have been evolving
remarkably fast. Recently, the DL advances in language modeling, machine translation, and …
Graph neural networks: foundation, frontiers and applications
The field of graph neural networks (GNNs) has seen rapid and incredible strides over the
recent years. Graph neural networks, also known as deep learning on graphs, graph …
CodeBLEU: a method for automatic evaluation of code synthesis
Evaluation metrics play a vital role in the growth of an area as it defines the standard of
distinguishing between good and bad models. In the area of code synthesis, the commonly …
Studying the usage of text-to-text transfer transformer to support code-related tasks
Deep learning (DL) techniques are gaining more and more attention in the software
engineering community. They have been used to support several code-related tasks, such …
Learning and evaluating contextual embedding of source code
Recent research has achieved impressive results on understanding and improving source
code by building up on machine-learning techniques developed for natural languages. A …
Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task
We present Spider, a large-scale, complex and cross-domain semantic parsing and text-to-
SQL dataset annotated by 11 college students. It consists of 10,181 questions and 5,693 …
code2seq: Generating sequences from structured representations of code
The ability to generate natural language sequences from source code snippets has a variety
of applications such as code summarization, documentation, and retrieval. Sequence-to …