Abstract | ||
---|---|---|
Huge volumes of data pose many opportunities and challenges in business and information societies. The similar pair identification problem arises in various fields such as image retrieval, near-duplicate document identification, plagiarism analysis, and entity resolution. As the number of items increases, exhaustive pair-wise similarity comparison becomes inefficient, and various techniques have been developed to handle the problem efficiently. Locality-sensitive hashing is one such technique: it avoids pair-wise comparisons when finding similar pairs. This paper introduces a modified version of the projection-based locality-sensitive hashing technique. The proposed method reduces the chance that similar pairs fall into different buckets, which is one of the major drawbacks of the projection-based technique. We have observed that the proposed method outperforms the conventional projection-based method in that it achieves a better recall rate, at the cost of some additional memory and computation. |
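
For orientation, the conventional projection-based scheme that the abstract takes as its baseline can be sketched roughly as follows. This is a minimal illustration and not the authors' modified method, which the abstract does not specify. The names `make_hash` and `candidate_pairs`, the Gaussian projections, and the parameters `n_proj`, `width`, and `seed` are all assumptions made for the example.

```python
import numpy as np


def make_hash(dim, n_proj, width, seed=0):
    """Build a projection-based LSH function (hypothetical helper).

    Each of the n_proj hash components projects a vector onto a random
    direction, adds a random offset, and quantizes the result into
    buckets of the given width.
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n_proj, dim))        # random projection directions
    b = rng.uniform(0.0, width, size=n_proj)  # random offsets in [0, width)

    def h(x):
        x = np.asarray(x, dtype=float)
        # Bucket index along each projection; the tuple is the bucket key.
        return tuple(np.floor((a @ x + b) / width).astype(int))

    return h


def candidate_pairs(points, h):
    """Return index pairs (i, j), i < j, whose points share a bucket."""
    buckets = {}
    for i, p in enumerate(points):
        buckets.setdefault(h(p), []).append(i)

    pairs = set()
    for ids in buckets.values():
        # Only items inside the same bucket are compared with each other.
        for j in range(len(ids)):
            for k in range(j + 1, len(ids)):
                pairs.add((ids[j], ids[k]))
    return pairs


if __name__ == "__main__":
    h = make_hash(dim=2, n_proj=4, width=1.0)
    points = [[1.0, 2.0], [1.0, 2.0], [1.05, 2.02], [40.0, -30.0]]
    print(candidate_pairs(points, h))
```

Only points that land in the same bucket become candidate pairs, so two nearby points that straddle a quantization boundary along any one projection are separated. That is the drawback the abstract refers to. Whether the near-duplicate at index 2 shares a bucket with indices 0 and 1 depends on the random projections and offsets drawn.
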
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/SCIS-ISIS.2012.6505385 | SCIS&ISIS |
Keywords | Field | DocType |
cryptography,entity resolution,image retrieval,locality-sensitive hashing technique,near-duplicate document identification,pair-wise similarity,plagiarism analysis,projection-based locality sensitive hashing,similar pair identification | Locality-sensitive hashing,Data mining,Pattern recognition,Computer science,Cryptography,Image retrieval,Artificial intelligence,Hash function,2-choice hashing,Parameter identification problem,Dynamic perfect hashing,Computation | Conference |
ISSN | ISBN | Citations |
2377-6870 | 978-1-4673-2742-8 | 4 |
PageRank | References | Authors |
0.51 | 6 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kyung Mi Lee | 1 | 7 | 1.94 |
Keon Myung Lee | 2 | 60 | 18.73 |