Binary code summarization: Benchmarking ChatGPT/GPT-4 and other large language models

X Jin, J Larson, W Yang, Z Lin - arXiv preprint arXiv:2312.09601, 2023 - arxiv.org
Binary code summarization, while invaluable for understanding code semantics, is
challenging due to its labor-intensive nature. This study delves into the potential of large …

TRACED: Execution-aware pre-training for source code

Y Ding, B Steenhoek, K Pei, G Kaiser, W Le… - Proceedings of the 46th …, 2024 - dl.acm.org
Most existing pre-trained language models for source code focus on learning the static code
text, typically augmented with static code structures (abstract syntax tree, dependency …

ReSym: Harnessing LLMs to Recover Variable and Data Structure Symbols from Stripped Binaries

D Xie, Z Zhang, N Jiang, X Xu, L Tan… - Proceedings of the 2024 …, 2024 - dl.acm.org
Decompilation aims to recover a binary executable to the source code form and hence has a
wide range of applications in cyber security, such as malware analysis and legacy code …

Symmetry-Preserving Program Representations for Learning Code Semantics

K Pei, W Li, Q Jin, S Liu, S Geng, L Cavallaro… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have shown promise in automated program reasoning, a
crucial aspect of many security tasks. However, existing LLM architectures for code are often …

Define-Use Guided Path Exploration for Better Forced Execution

D He, D Xie, Y Wang, W You, B Liang… - Proceedings of the 33rd …, 2024 - dl.acm.org
The evolution of recent malware, characterized by the escalating use of cloaking techniques,
poses a significant challenge in the analysis of malware behaviors. Researchers proposed …

A Progressive Transformer for Unifying Binary Code Embedding and Knowledge Transfer

H Lu, H Cai, Y Liang, A Bianchi, ZB Celik - arXiv preprint arXiv:2412.11177, 2024 - arxiv.org
Language model approaches have recently been integrated into binary analysis tasks, such
as function similarity detection and function signature recovery. These models typically …

[BOOK] Analyzing and Securing Software via Robust and Generalizable Learning

K Pei - 2023 - search.proquest.com
Software permeates every facet of our lives, improving their convenience and efficiency, and
its sphere of influence continues to expand, leading to novel applications and services …