Title
API-Based Software Birthmarking Method Using Fuzzy Hashing
Abstract
Software birthmarking has conventionally been studied in connection with software piracy, code theft, and copyright infringement. The most recent API-based software birthmarking method (Han et al., 2014) extracts API call sequences across the entire code sections of a program and generates a birthmark from them using a cryptographic hash function (MD5). It was reported that programs can be categorized by application type through pre-filtering based on DLL/API numbers/names. However, that method has three limitations: similarity cannot be measured because of the cryptographic hash function, false negatives occur, and it is difficult to functionally categorize applications using only DLL/API numbers/names. In this paper, we propose an API-based software birthmarking method using fuzzy hashing. For the native code of a program, our technique extracts API call sequences from the segmented procedures and then generates a birthmark from them using a fuzzy hash function. Unlike a conventional cryptographic hash function, a fuzzy hash can be used to measure the similarity of data. Our method achieved a high reduction ratio (about 41% on average) relative to an original birthmark generated from only the API call sequences. In our experiments, with the threshold epsilon set to 0.35, the results show that our method is an effective birthmarking system for measuring software similarity. Moreover, our correlation analysis of the top 50 API call frequencies shows that it is difficult to functionally categorize applications using only DLL/API numbers/names. Compared to prior work, our method significantly improves the properties of resilience and credibility.
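The contrast between a cryptographic digest and a similarity-preserving digest can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the API call lists are made up, `piecewise_birthmark` is a toy per-procedure digest standing in for a real fuzzy hash function, similarity is computed with Python's `difflib.SequenceMatcher`, and `classify` assumes the convention often used in birthmarking work (a score of at least 1 - epsilon suggests a copy, a score of at most epsilon suggests independence), which the abstract does not spell out.

```python
import hashlib
from difflib import SequenceMatcher

# 64-symbol alphabet: each procedure contributes one symbol to the toy birthmark.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def md5_birthmark(procedures):
    """Cryptographic baseline: one MD5 digest over every API call in order."""
    joined = "\n".join(api for proc in procedures for api in proc)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def piecewise_birthmark(procedures):
    """Toy stand-in for a fuzzy hash (assumption): one symbol per procedure."""
    symbols = []
    for proc in procedures:
        digest = hashlib.md5("\n".join(proc).encode("utf-8")).digest()
        symbols.append(ALPHABET[digest[0] % len(ALPHABET)])
    return "".join(symbols)


def similarity(a, b):
    """Score in [0, 1]; 1.0 means the two birthmark strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


def classify(score, epsilon=0.35):
    """Assumed decision rule; only the value 0.35 comes from the abstract."""
    if score >= 1.0 - epsilon:
        return "likely copy"
    if score <= epsilon:
        return "likely independent"
    return "inconclusive"


# Hypothetical API call sequences, one list per segmented procedure.
ORIGINAL = [
    ["CreateFileW", "ReadFile", "CloseHandle"],
    ["RegOpenKeyExW", "RegQueryValueExW", "RegCloseKey"],
    ["socket", "connect", "send", "closesocket"],
    ["GetSystemTime", "Sleep"],
]
# Same program with one extra call inserted into the third procedure.
MODIFIED = [
    ["CreateFileW", "ReadFile", "CloseHandle"],
    ["RegOpenKeyExW", "RegQueryValueExW", "RegCloseKey"],
    ["socket", "connect", "send", "recv", "closesocket"],
    ["GetSystemTime", "Sleep"],
]

if __name__ == "__main__":
    print("MD5 equal:", md5_birthmark(ORIGINAL) == md5_birthmark(MODIFIED))
    score = similarity(piecewise_birthmark(ORIGINAL), piecewise_birthmark(MODIFIED))
    print("piecewise similarity:", score, "->", classify(score))
```

A single inserted call changes the whole MD5 digest, so the two digests can only be compared for equality. The per-procedure digest changes in at most one position, so a graded similarity score remains available. A real fuzzy hash would also tolerate small edits inside a procedure, which this toy version does not.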
Year: 2016
DOI: 10.1587/transinf.2015EDP7379
Venue: IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS
Keywords: software birthmark, birthmarking systems, software similarity, fuzzy hash, API-based sequences
Field: Data mining, Computer science, Software birthmark, Fuzzy logic, Software, Hash function
DocType: Journal
Volume: E99D
Issue: 7
ISSN: 1745-1361
Citations: 0
PageRank: 0.34
References: 11
Authors: 5
Name             Order   Citations   PageRank
Donghoon Lee     1       151         22.04
Dongwoo Kang     2       153         19.98
Younsung Choi    3       57          5.49
Jiye Kim         4       118         7.87
Dongho Won       5       1262        154.14