A survey of automatic source code summarization

C Zhang, J Wang, Q Zhou, T Xu, K Tang, H Gui, F Liu - Symmetry, 2022 - mdpi.com
Source code summarization refers to the natural language description of the source code's
function. It can help developers easily understand the semantics of the source code. We can …

How machine learning is solving the binary function similarity problem

A Marcelli, M Graziano, X Ugarte-Pedrero… - 31st USENIX Security …, 2022 - usenix.org
The ability to accurately compute the similarity between two pieces of binary code plays an
important role in a wide range of different problems. Several research communities such as …

A survey of binary code fingerprinting approaches: taxonomy, methodologies, and features

S Alrabaee, M Debbabi, L Wang - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
Binary code fingerprinting is crucial in many security applications. Examples include
malware detection, software infringement, vulnerability analysis, and digital forensics. It is …

CCTest: Testing and repairing code completion systems

Z Li, C Wang, Z Liu, H Wang, D Chen… - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
Code completion, a highly valuable topic in the software development domain, has been
increasingly promoted for use by recent advances in large language models (LLMs). To …

Binary code summarization: Benchmarking ChatGPT/GPT-4 and other large language models

X Jin, J Larson, W Yang, Z Lin - arXiv preprint arXiv:2312.09601, 2023 - arxiv.org
Binary code summarization, while invaluable for understanding code semantics, is
challenging due to its labor-intensive nature. This study delves into the potential of large …

How could neural networks understand programs?

D Peng, S Zheng, Y Li, G Ke, D He… - … on Machine Learning, 2021 - proceedings.mlr.press
Semantic understanding of programs is a fundamental problem for programming language
processing (PLP). Recent works that learn representations of code based on pre-training …

CSGVD: A deep learning approach combining sequence and graph embedding for source code vulnerability detection

W Tang, M Tang, M Ban, Z Zhao, M Feng - Journal of Systems and Software, 2023 - Elsevier
In order to secure software, it is critical to detect potential vulnerabilities. The performance of
traditional static vulnerability detection methods is limited by predefined rules, which rely …

CLAP: Learning transferable binary code representations with natural language supervision

H Wang, Z Gao, C Zhang, Z Sha, M Sun… - Proceedings of the 33rd …, 2024 - dl.acm.org
Binary code representation learning has shown significant performance in binary analysis
tasks. But existing solutions often have poor transferability, particularly in few-shot and zero …

Do code summarization models process too much information? Function signature may be all that is needed

X Ding, R Peng, X Chen, Y Huang, J Bian… - ACM Transactions on …, 2024 - dl.acm.org
With the fast development of large software projects, automatic code summarization
techniques, which summarize the main functionalities of a piece of code using natural …

Learning approximate execution semantics from traces for binary function similarity

K Pei, Z Xuan, J Yang, S Jana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Detecting semantically similar binary functions–a crucial capability with broad security
usages including vulnerability detection, malware analysis, and forensics–requires …