Title
Software plagiarism detection: a graph-based approach
Abstract
As software plagiarism increases rapidly, there is a growing need for software plagiarism detection systems. In this paper, we propose a software plagiarism detection system that uses an API-labeled control flow graph (A-CFG) to abstract the functionality of a program. The A-CFG reflects both the sequence and the frequency of APIs, whereas previous work rarely considers the two together. To compare pairs of A-CFGs scalably, we use random walk with restart (RWR), which computes an importance score for each node in a graph. With RWR, we generate a single score vector for each A-CFG and compare A-CFGs by comparing their score vectors. Extensive evaluations on a set of Windows applications demonstrate the effectiveness and scalability of the proposed system compared with existing methods.
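The pipeline described in the abstract (score each node with RWR, then compare score vectors) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: each graph has one node per API name, repeated edges stand in for call frequency, the walk restarts uniformly over all nodes (a PageRank-style variant), the restart probability is 0.15, and score vectors are compared by cosine similarity. The function names `rwr_scores` and `cosine` and the three toy graphs are hypothetical.

```python
import math


def rwr_scores(edges, restart=0.15, iters=100):
    """Approximate RWR scores by power iteration (illustrative variant).

    edges: list of (src_api, dst_api) pairs; a repeated pair acts as a heavier edge.
    Assumption: the walker restarts uniformly over all nodes.
    """
    nodes = sorted({n for edge in edges for n in edge})
    out = {u: [] for u in nodes}
    for src, dst in edges:
        out[src].append(dst)
    score = {u: 1.0 / len(nodes) for u in nodes}
    for _ in range(iters):
        # Restart mass is spread evenly over all nodes.
        nxt = {u: restart / len(nodes) for u in nodes}
        for u in nodes:
            # A node without successors redistributes its mass to every node.
            targets = out[u] or nodes
            share = (1.0 - restart) * score[u] / len(targets)
            for v in targets:
                nxt[v] += share
        score = nxt
    return score


def cosine(a, b):
    """Cosine similarity of two sparse score vectors keyed by API name."""
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical toy graphs: an original, a copy with one inserted call, an unrelated program.
original = [("CreateFile", "ReadFile"), ("ReadFile", "WriteFile"),
            ("WriteFile", "CloseHandle")]
copied = [("CreateFile", "GetTickCount"), ("GetTickCount", "ReadFile"),
          ("ReadFile", "WriteFile"), ("WriteFile", "CloseHandle")]
unrelated = [("socket", "connect"), ("connect", "send"), ("send", "closesocket")]

print(cosine(rwr_scores(original), rwr_scores(copied)))
print(cosine(rwr_scores(original), rwr_scores(unrelated)))
```

Because each graph collapses to a single vector indexed by API name, comparing two programs in this sketch costs time proportional to the vector length rather than requiring a graph matching step, which is the kind of scalability the abstract refers to.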
Year
2013
DOI
10.1145/2505515.2507848
Venue
CIKM
Keywords
random walk,windows application,extensive evaluation,graph-based approach,score vector,proposed system,importance score,software plagiarism detection system,previous work,api-labeled control flow graph,single score vector,similarity,graph
Field
Data mining,Graph,Information retrieval,Plagiarism detection,Control flow graph,Computer science,Random walk,Binary analysis,Software,Artificial intelligence,Machine learning,Scalability
DocType
Conference
Citations
13
PageRank
0.76
References
10
Authors
5
Name, Order, Citations, PageRank
Dong-Kyu Chae, 1, 59, 10.07
Jiwoon Ha, 2, 66, 6.95
Sang-Wook Kim, 3, 792, 152.77
BooJoong Kang, 4, 118, 11.55
Eul Gyu Im, 5, 175, 24.80