A survey of machine learning for big code and naturalness
Research at the intersection of machine learning, programming languages, and software
engineering has recently taken important steps in proposing learnable probabilistic models …
A review of ChatGPT applications in education, marketing, software engineering, and healthcare: Benefits, drawbacks, and research directions
ChatGPT is a type of artificial intelligence language model that uses deep learning
algorithms to generate human-like responses to text-based prompts. The introduction of the …
RepoCoder: Repository-level code completion through iterative retrieval and generation
The task of repository-level code completion is to continue writing the unfinished code based
on a broader context of the repository. While for automated code completion tools, it is …
Few-shot training LLMs for project-specific code-summarization
Very large language models (LLMs), such as GPT-3 and Codex, have achieved state-of-the-art
performance on several natural-language tasks, and show great promise also for code. A …
On the robustness of code generation techniques: An empirical study on GitHub Copilot
Software engineering research has always been concerned with the improvement of code
completion approaches, which suggest the next tokens a developer will likely type while …
Multi-task learning based pre-trained language model for code completion
Code completion is one of the most useful features in the Integrated Development
Environments (IDEs), which can accelerate software development by suggesting the next …
Big code != big vocabulary: Open-vocabulary models for source code
Statistical language modeling techniques have successfully been applied to large source
code corpora, yielding a variety of new software development tools, such as tools for code …
ReACC: A retrieval-augmented code completion framework
Code completion, which aims to predict the following code token(s) according to the code
context, can improve the productivity of software development. Recent work has proved that …
RepoBench: Benchmarking repository-level code auto-completion systems
Large Language Models (LLMs) have greatly advanced code auto-completion systems, with
a potential for substantial productivity enhancements for developers. However, current …
Deep learning code fragments for code clone detection
Code clone detection is an important problem for software maintenance and evolution. Many
approaches consider either structure or identifiers, but none of the existing detection …