Title
αDiff: cross-version binary code similarity detection with DNN.
Abstract
Binary code similarity detection (BCSD) has many applications, including patch analysis, plagiarism detection, malware detection, and vulnerability search. Existing solutions usually compare specific syntactic features extracted from binary code based on expert knowledge, and they suffer from either high performance overhead or low detection accuracy. Moreover, few solutions are suitable for detecting similarities between cross-version binaries, which may diverge not only in syntactic structure but also slightly in semantics. In this paper, we propose αDiff, a solution that employs three semantic features to address the cross-version BCSD challenge. It first extracts the intra-function feature of each binary function using a deep neural network (DNN). The DNN works directly on the raw bytes of each function, rather than on features (e.g., syntactic structures) provided by experts. αDiff further analyzes the function call graph of each binary, which is relatively stable across versions, and extracts the inter-function and inter-module features. A distance is then computed from these three features and used for BCSD. We have implemented a prototype of αDiff and evaluated it on a dataset with about 2.5 million samples. The results show that αDiff outperforms state-of-the-art static solutions by over 10 percentage points on average in different BCSD settings.
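To make the idea of combining three per-function features into one distance concrete, here is a minimal sketch in Python. The abstract does not give the actual formula, so everything specific here is an assumption for illustration: the weighted sum, the use of Euclidean and Jaccard distances, the dictionary field names (`embedding`, `degrees`, `imports`), and the toy values. The embedding stands in for the output of the DNN on a function's raw bytes, the degree pair for a call-graph (inter-function) feature, and the import set for an inter-module feature.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def jaccard_distance(s, t):
    """1 - |s ∩ t| / |s ∪ t|; defined as 0.0 when both sets are empty."""
    union = s | t
    if not union:
        return 0.0
    return 1.0 - len(s & t) / len(union)

def combined_distance(f, g, weights=(1.0, 0.5, 0.5)):
    """Hypothetical combination of three feature distances (assumed weighted sum)."""
    d_intra = euclidean(f["embedding"], g["embedding"])   # intra-function feature
    d_inter = euclidean(f["degrees"], g["degrees"])       # (in-degree, out-degree) in the call graph
    d_module = jaccard_distance(f["imports"], g["imports"])  # imported functions called
    w1, w2, w3 = weights
    return w1 * d_intra + w2 * d_inter + w3 * d_module

# Toy data: one function from an old binary, two candidates from a new version.
old = {"embedding": [0.1, 0.9, 0.3], "degrees": (2, 3),
       "imports": {"memcpy", "strlen"}}
new_same = {"embedding": [0.1, 0.8, 0.3], "degrees": (2, 3),
            "imports": {"memcpy", "strlen", "free"}}
new_other = {"embedding": [0.9, 0.1, 0.7], "degrees": (0, 7),
             "imports": {"socket"}}

# The candidate with the smallest combined distance is reported as the match.
best = min([new_same, new_other], key=lambda g: combined_distance(old, g))
```

In this toy setup the slightly modified candidate is selected because its embedding, call-graph degrees, and imports all stay close to the original, which mirrors the intuition that call-graph features remain relatively stable across versions.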
Year: 2018
Venue: ASE
Field: Byte, Pattern recognition, Plagiarism detection, Subroutine, Computer science, Binary code, Binary function, Theoretical computer science, Artificial intelligence, Artificial neural network, Malware, Binary number
DocType: Conference
ISBN: 978-1-4503-5937-5
Citations: 2
PageRank: 0.35
References: 36
Authors: 7
Name             Order   Citations   PageRank
Bingchang Liu    1       6           2.08
Wei Huo          2       47          6.02
Chao Zhang       3       423         38.17
Wenchao Li       4       101         20.69
Feng Li          5       8           3.46
Aihua Piao       6       3           1.37
Zou Wei          7       40          7.40