Title
I/O-efficient calculation of H-group closeness centrality over disk-resident graphs.
Abstract
We introduce H-group closeness centrality in this work. H-group closeness centrality of a group of nodes measures how close this node group is to other nodes in a graph, and can be used in numerous applications such as measuring the importance and influence of a group of users in a social network. When a large graph contains billions of edges that cannot reside entirely in a computers main memory, computing and maximizing H-group closeness centrality both become challenging. In this work, we present a systematic solution for efficiently computing and maximizing H-group closeness centrality in large disk-resident graphs, only using a common PC. Our solution leverages a probabilistic counting method to efficiently estimate H-group closeness with high accuracy, rather than exhaustively computing it in an exact fashion. Furthermore, we design an I/O-efficient greedy algorithm to find a node group that approximately maximizes H-group closeness centrality in a graph. Our algorithm exploits several appealing properties of the defined H-group closeness measure to reduce the computational cost of handling a disk-resident graph. Extensive evaluation on large real-world and synthetic graphs demonstrates the efficiency of our proposed method. For example, our proposed I/O-efficient greedy algorithm is about 300 times faster than a simple multi-pass method on the Twitter graph with 1.4 billion edges. This reduces the running time of identifying one group member from nearly an hour to less than 20s on average.
Year
DOI
Venue
2017
10.1016/j.ins.2017.03.017
Inf. Sci.
Keywords
Field
DocType
Online social network,Group centrality,Disk-resident graph,Submodularity,Greedy algorithm
Network science,Discrete mathematics,Random walk closeness centrality,Closeness,Centrality,Network controllability,Greedy algorithm,Betweenness centrality,Network theory,Artificial intelligence,Machine learning,Mathematics
Journal
Volume
Issue
ISSN
400
C
0020-0255
Citations 
PageRank 
References 
1
0.35
29
Authors
5
Name
Order
Citations
PageRank
Junzhou Zhao110418.36
Ping-Hui Wang223633.39
John C.S. Lui33680279.85
Don Towsley4186931951.05
X. Guan51169137.97