Title
Similarity of Source Code in the Presence of Pervasive Modifications
Abstract
Source code analysis to detect code cloning, code plagiarism, and code reuse suffers from the problem of pervasive code modifications, i.e. transformations that may have a global effect. We compare 30 similarity detection techniques and tools against pervasive code modifications. We evaluate the tools using two experimental scenarios for Java source code. These are (1) pervasive modifications created with tools for source code and bytecode obfuscation and (2) source code normalisation through compilation and decompilation using different decompilers. Our experimental results show that highly specialised source code similarity detection techniques and tools can perform better than more general, textual similarity measures. Our study strongly validates the use of compilation/decompilation as a normalisation technique. Its use reduced false classifications to zero for six of the tools. This broad, thorough study is the largest in existence and potentially an invaluable guide for future users of similarity detection in source code.
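As a rough illustration of why pervasive modifications hurt general textual similarity measures, and why normalisation helps, the sketch below compares two Java fragments that differ only in identifier names. It is not one of the tools or techniques evaluated in the paper. The token-set Jaccard measure, the identifier-renaming normaliser (a simple stand-in for compilation/decompilation), and all function names are assumptions made for this example.

```python
import re

# Deliberately tiny, illustrative subset of Java keywords (an assumption for this sketch).
KEYWORDS = {"int", "return"}

def tokens(source: str) -> list[str]:
    # Split into identifiers/keywords, integer literals, and single punctuation characters.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)

def jaccard_similarity(a: str, b: str) -> float:
    # Jaccard similarity over the sets of tokens of the two fragments.
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def normalise(source: str) -> str:
    # Hypothetical stand-in for normalisation: rename each non-keyword
    # identifier to id0, id1, ... in order of first appearance.
    mapping: dict[str, str] = {}
    out = []
    for tok in tokens(source):
        if re.fullmatch(r"[A-Za-z_]\w*", tok) and tok not in KEYWORDS:
            tok = mapping.setdefault(tok, f"id{len(mapping)}")
        out.append(tok)
    return " ".join(out)

original = "int sum(int a, int b) { return a + b; }"
renamed = "int f(int x, int y) { return x + y; }"

# Renaming every identifier lowers the raw token-set similarity ...
raw = jaccard_similarity(original, renamed)
# ... while both fragments normalise to the same text.
normalised = jaccard_similarity(normalise(original), normalise(renamed))
print(raw, normalised)
```

In this toy case the renamed fragment scores below 1 before normalisation and exactly 1 after it. Real obfuscators and decompilers change far more than identifier names, so this only hints at the effect the study measures.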
Year
2016
DOI
10.1109/SCAM.2016.13
Venue
2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM)
Keywords
source code similarity, decompilation, code normalisation, code cloning, code reuse, code plagiarism
Field
Codebase, Static program analysis, Programming language, Source code, Computer science, Code generation, Theoretical computer science, KPI-driven code analysis, Code reuse, Bytecode, Code review
DocType
Conference
ISSN
1942-5430
ISBN
978-1-5090-3849-7
Citations
4
PageRank
0.43
References
28
Authors
3
Name, Order, Citations, PageRank
Chaiyong Ragkhitwetsagul, 1, 4, 1.11
Jens Krinke, 2, 1533, 76.35
David M. Clark, 3, 153, 16.33