Title
Similar pair identification using locality-sensitive hashing technique
Abstract
Huge volumes of data pose many opportunities and challenges in business and information societies. The similar pair identification problem arises in various fields such as image retrieval, near-duplicate document identification, plagiarism analysis, and entity resolution. As the number of items grows, exhaustive pair-wise similarity comparison becomes inefficient. Various techniques have been developed to handle this problem efficiently. Locality-sensitive hashing is one such technique: it avoids pair-wise comparisons when finding similar pairs. This paper introduces a modified version of the projection-based locality-sensitive hashing technique. The proposed method reduces the chance that similar pairs fall into different buckets, which is one of the major drawbacks of the projection-based technique. We have observed that the proposed method outperforms the conventional projection-based method in that it achieves a better recall rate, at the expense of some additional memory and computation costs.
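To make the bucketing idea concrete, the following is a minimal sketch of the conventional projection-based scheme only, not the modified method the paper proposes, which the abstract does not describe. It assumes the commonly used hash form h(v) = floor((a . v + b) / w) with a Gaussian random direction a and a uniform offset b. The function names `make_hash` and `candidate_pairs` and all parameter values are hypothetical, chosen for illustration.

```python
import math
import random
from collections import defaultdict
from itertools import combinations


def make_hash(dim, width, rng):
    """Return one projection-based hash: h(v) = floor((a . v + b) / width)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random projection direction
    b = rng.uniform(0.0, width)                     # random offset within one bucket width

    def h(v):
        dot = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((dot + b) / width)

    return h


def candidate_pairs(points, num_tables=8, hashes_per_table=2, width=4.0, seed=0):
    """Collect index pairs (i, j), i < j, that share a bucket in at least one table."""
    rng = random.Random(seed)
    dim = len(points[0])
    pairs = set()
    for _ in range(num_tables):
        # Each table concatenates several hashes into one bucket key.
        hashes = [make_hash(dim, width, rng) for _ in range(hashes_per_table)]
        buckets = defaultdict(list)
        for idx, p in enumerate(points):
            key = tuple(h(p) for h in hashes)
            buckets[key].append(idx)
        # Only items in the same bucket are compared later, not all pairs.
        for members in buckets.values():
            pairs.update(combinations(members, 2))
    return pairs
```

In this sketch, two nearby points can still land on opposite sides of a bucket boundary in a given table, which is the drawback the abstract refers to. Using several tables is one conventional way to lower that risk, at the cost of extra memory and computation.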
Year: 2012
DOI: 10.1109/SCIS-ISIS.2012.6505385
Venue: SCIS&ISIS
Keywords: cryptography, entity resolution, image retrieval, locality-sensitive hashing technique, near-duplicate document identification, pair-wise similarity, plagiarism analysis, projection-based locality sensitive hashing, similar pair identification
Field: Locality-sensitive hashing, Data mining, Pattern recognition, Computer science, Cryptography, Image retrieval, Artificial intelligence, Hash function, 2-choice hashing, Parameter identification problem, Dynamic perfect hashing, Computation
DocType: Conference
ISSN: 2377-6870
ISBN: 978-1-4673-2742-8
Citations: 4
PageRank: 0.51
References: 6
Authors: 2
Name | Order | Citations | PageRank
Kyung Mi Lee | 1 | 7 | 1.94
Keon Myung Lee | 2 | 60 | 18.73