Title
Comparison of Overlap Detection Techniques
Abstract
Easy access to the World Wide Web has raised concerns about copyright issues and plagiarism. It is easy to copy someone else's work and submit it as one's own. Many systems have targeted this problem, and they use very similar approaches. This paper compares these approaches and suggests when one strategy is more applicable than another. Some alternative approaches are proposed that perform better than previously presented methods. The previous methods share two common stages: chunking of documents and selection of representative chunks. We study both stages and propose alternatives that are better in terms of accuracy and space requirements. The applications of these methods are not limited to plagiarism detection; they may also target other copy-detection problems. We also propose a third stage, applied during comparison, that uses suffix trees and suffix vectors to identify the overlapping chunks.
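The two stages named in the abstract, chunking and selection of representative chunks, can be illustrated with a minimal sketch. It is not the authors' implementation: the overlapping word-window chunking, the MD5-based hash, the "keep hashes divisible by `keep_every`" selection rule, and all function names are assumptions chosen for illustration only. The suffix-tree stage is not shown.

```python
import hashlib


def word_chunks(text, size=3):
    """Chunking stage (assumed strategy): overlapping windows of `size` words."""
    words = text.lower().split()
    if len(words) < size:
        return [" ".join(words)] if words else []
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


def chunk_hash(chunk):
    """Map a chunk to a 64-bit integer taken from its MD5 digest."""
    digest = hashlib.md5(chunk.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fingerprint(text, size=3, keep_every=1):
    """Selection stage (assumed rule): keep hashes divisible by `keep_every`.

    keep_every=1 keeps every chunk; larger values trade accuracy for space.
    """
    hashes = (chunk_hash(c) for c in word_chunks(text, size))
    return {h for h in hashes if h % keep_every == 0}


def overlap(suspect, source, size=3, keep_every=1):
    """Fraction of the suspect's selected chunks that also occur in the source."""
    fs = fingerprint(suspect, size, keep_every)
    fo = fingerprint(source, size, keep_every)
    if not fs:
        return 0.0
    return len(fs & fo) / len(fs)


if __name__ == "__main__":
    a = "the quick brown fox jumps"
    b = "the quick brown fox sleeps"
    print(overlap(a, b))
```

Raising `keep_every` shrinks the stored fingerprint at the cost of possibly missing short overlaps, which is the accuracy versus space trade-off the abstract refers to.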
Year
2002
DOI
10.1007/3-540-46043-8_4
Venue
International Conference on Computational Science (1)
Keywords
previous methods share, different strategy, common stage, easy access, copy-detection problem, overlap detection techniques, suffix tree, alternative approach, suffix vector, world wide web, copyright issue
Field
Plagiarism detection, Suffix, Computer science, Artificial intelligence, Chunking (psychology), Suffix tree, Digital library, String (computer science), Machine learning, Distributed computing
DocType
Conference
Volume
2329
ISSN
0302-9743
ISBN
3-540-43591-3
Citations
18
PageRank
1.34
References
5
Authors
5
Name                  Order  Citations  PageRank
Krisztián Monostori   1      89         9.91
Raphael A. Finkel     2      1080       368.97
Arkady B. Zaslavsky   3      943        168.27
Gábor Hodász          4      18         2.69
Máté Pataki           5      24         4.15