Title | ||
---|---|---|
ExactSim: benchmarking single-source SimRank algorithms with high-precision ground truths |
Abstract | ||
---|---|---|
SimRank is a popular measurement for evaluating the node-to-node similarities based on the graph topology. In recent years, single-source and top-k SimRank queries have received increasing attention due to their applications in web mining, social network analysis, and spam detection. However, a fundamental obstacle in studying SimRank has been the lack of ground truths. The only exact algorithm, Power Method, is computationally infeasible on graphs with more than
$$10^6$$
nodes. Consequently, no existing work has evaluated the actual accuracy of various single-source and top-k SimRank algorithms on large real-world graphs. In this paper, we present ExactSim, the first algorithm that computes the exact single-source and top-k SimRank results on large graphs. This algorithm produces ground truths with precision up to 7 decimal places with high probability. With the ground truths computed by ExactSim, we present the first experimental study of the accuracy/cost trade-offs of existing approximate SimRank algorithms on large real-world graphs and synthetic graphs. Finally, we use the ground truths to exploit various properties of SimRank distributions on large graphs.
|
Year | DOI | Venue |
---|---|---|
2021 | 10.1007/s00778-021-00672-7 | The VLDB Journal |
Keywords | DocType | Volume |
SimRank, Single-source, Exact computation, Ground truths, Power-law, Benchmarks | Journal | 30 |
Issue | ISSN | Citations |
6 | 1066-8888 | 0 |
PageRank | References | Authors |
0.34 | 8 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wang Hanzhi | 1 | 4 | 2.10 |
Zhewei Wei | 2 | 215 | 20.07 |
Yu Liu | 3 | 2 | 1.38 |
Ye Yuan | 4 | 117 | 24.40 |
Xiaoyong Du | 5 | 7 | 1.15 |
Ji-Rong Wen | 6 | 4431 | 265.98 |