Title
ExactSim: benchmarking single-source SimRank algorithms with high-precision ground truths
Abstract
SimRank is a popular measurement for evaluating the node-to-node similarities based on the graph topology. In recent years, single-source and top-k SimRank queries have received increasing attention due to their applications in web mining, social network analysis, and spam detection. However, a fundamental obstacle in studying SimRank has been the lack of ground truths. The only exact algorithm, Power Method, is computationally infeasible on graphs with more than $$10^6$$ nodes. Consequently, no existing work has evaluated the actual accuracy of various single-source and top-k SimRank algorithms on large real-world graphs. In this paper, we present ExactSim, the first algorithm that computes the exact single-source and top-k SimRank results on large graphs. This algorithm produces ground truths with precision up to 7 decimal places with high probability. With the ground truths computed by ExactSim, we present the first experimental study of the accuracy/cost trade-offs of existing approximate SimRank algorithms on large real-world graphs and synthetic graphs. Finally, we use the ground truths to exploit various properties of SimRank distributions on large graphs.
Year
DOI
Venue
2021
10.1007/s00778-021-00672-7
The VLDB Journal
Keywords
DocType
Volume
SimRank, Single-source, Exact computation, Ground truths, Power-law, Benchmarks
Journal
30
Issue
ISSN
Citations 
6
1066-8888
0
PageRank 
References 
Authors
0.34
8
6
Name
Order
Citations
PageRank
Wang Hanzhi142.10
Zhewei Wei221520.07
Yu Liu321.38
Ye Yuan411724.40
Xiaoyong Du571.15
Ji-Rong Wen64431265.98