Self-supervised contrastive learning for code retrieval and summarization via semantic-preserving transformations

NDQ Bui, Y Yu, L Jiang - Proceedings of the 44th International ACM …, 2021 - dl.acm.org
We propose Corder, a self-supervised contrastive learning framework for source code
model. Corder is designed to alleviate the need of labeled data for code retrieval and code …

ExploitGen: Template-augmented exploit code generation based on CodeBERT

G Yang, Y Zhou, X Chen, X Zhang, T Han… - Journal of Systems and …, 2023 - Elsevier
Exploit code is widely used for detecting vulnerabilities and implementing defensive
measures. However, automatic generation of exploit code for security assessment is a …

Contrastive code representation learning

P Jain, A Jain, T Zhang, P Abbeel, JE Gonzalez… - arXiv preprint arXiv …, 2020 - arxiv.org
Recent work learns contextual representations of source code by reconstructing tokens from
their context. For downstream semantic understanding tasks like summarizing code in …

Exploring software naturalness through neural language models

L Buratti, S Pujar, M Bornea, S McCarley… - arXiv preprint arXiv …, 2020 - arxiv.org
The Software Naturalness hypothesis argues that programming languages can be
understood through the same techniques used in natural language processing. We explore …

Automatic detection of Long Method and God Class code smells through neural source code embeddings

A Kovačević, J Slivka, D Vidaković, KG Grujić… - Expert Systems with …, 2022 - Elsevier
Code smells are structures in code that often harm its quality. Manually detecting code
smells is challenging, so researchers proposed many automatic detectors. Traditional code …

Evaluating the Usability and Functionality of Intelligent Source Code Completion Assistants: A Comprehensive Review

T Hliš, L Četina, T Beranič, L Pavlič - Applied Sciences, 2023 - mdpi.com
As artificial intelligence advances, source code completion assistants are becoming more
advanced and powerful. Existing traditional assistants are no longer up to all the developers' …

On the effectiveness of transfer learning for code search

P Salza, C Schwizer, J Gu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
The Transformer architecture and transfer learning have marked a quantum leap in natural
language processing, improving the state of the art across a range of text-based tasks. This …

Benchmarking causal study to interpret large language models for source code

D Rodriguez-Cardenas, DN Palacio… - 2023 IEEE …, 2023 - ieeexplore.ieee.org
One of the most common solutions adopted by software researchers to address code
generation is by training Large Language Models (LLMs) on massive amounts of source …

Toward a theory of causation for interpreting neural code models

DN Palacio, A Velasco, N Cooper… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Neural Language Models of Code, or Neural Code Models (NCMs), are rapidly progressing
from research prototypes to commercial developer tools. As such, understanding the …

Automatic detection of code smells using metrics and CodeT5 embeddings: a case study in C#

A Kovačević, N Luburić, J Slivka, S Prokić… - Neural Computing and …, 2024 - Springer
Code smells are poorly designed code structures indicating that the code may need to be
refactored. Recognizing code smells in practice is complex, and researchers strive to …