Title
Scalable Distributed Hashing for Approximate Nearest Neighbor Search
Abstract
Hashing has been widely applied to the large-scale approximate nearest neighbor search problem owing to its high efficiency and low storage requirement. Most investigations concentrate on learning hashing methods in a centralized setting. However, in existing big data systems, data is often stored across different nodes. In some situations, data is even collected in a distributed manner. A straightforward way to solve this problem is to aggregate all the data into the fusion center to obtain the search result (aggregating method). However, this strategy is not feasible because of the prohibitive communication cost. Although a few distributed hashing methods have been proposed to reduce this cost, they only focus on designing a distributed algorithm for a specific global optimization objective without considering scalability. Moreover, existing distributed hashing methods aim at finding a distributed solution to hashing, meanwhile avoiding accuracy loss, rather than improving accuracy. To address these challenges, we propose a Scalable Distributed Hashing (SDisH) model in which most existing hashing methods can be extended to process distributed data with no changes. Furthermore, to improve accuracy, we utilize the search radius as a global variable across different nodes to achieve a global optimum search result for every iteration. In addition, a voting algorithm is presented based on the results produced by multiple iterations to further reduce search errors. Theoretical analyses of communication, computation, and accuracy demonstrate the superiority of the proposed model. Numerical simulations on three large-scale and two relatively small benchmark datasets also show that the SDisH model achieves up to 44.75% and 10.23% accuracy gains compared to the aggregating method and state-of-the-art distributed hashing methods, respectively.
Year
DOI
Venue
2022
10.1109/TIP.2021.3130528
IEEE Transactions on Image Processing
Keywords
DocType
Volume
Hashing,distributed,approximate nearest neighbor search
Journal
31
Issue
ISSN
Citations 
1
1057-7149
0
PageRank 
References 
Authors
0.34
24
7
Name
Order
Citations
PageRank
Yuan Cao1112.86
Junwei Liu200.34
Heng Qi321830.45
Jie Gui470025.72
Keqiu Li500.34
Jieping Ye66943351.37
Chao Liu7107.00