Title
Current Flow Betweenness Centrality with Apache Spark
Abstract
Identifying the most central nodes of a graph is a fundamental task in data analysis. Current flow betweenness is a centrality index that considers how information flows along all the paths of a graph, not only the shortest ones. Computing the exact value of the current flow betweenness is expensive for large graphs, so algorithms that return an approximation of this measure are needed. In this paper we propose a solution, based on the Gather Apply Scatter model, that estimates the current flow betweenness in a distributed setting using the Apache Spark framework. The experimental evaluation shows that the algorithm achieves high correlation with the exact value of the index and outperforms other algorithms.
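To make the measure concrete, the following is a minimal single-machine sketch of the exact index under the usual electrical-network reading: every edge is a unit resistor, one unit of current is injected at a source s and extracted at a target t, and a node's score is the average current passing through it over all pairs that do not include it. This is not the distributed approximation proposed in the paper; the function names `invert` and `current_flow_betweenness` are hypothetical, and the graph is assumed to be connected, undirected, and unweighted with nodes labelled 0..n-1.

```python
from itertools import combinations


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [
        [float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: the graph must be connected")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def current_flow_betweenness(n, edges):
    """Exact current flow betweenness for a connected undirected graph.

    Assumptions (not taken from the paper): nodes are 0..n-1, every edge has
    unit conductance, and scores are normalised by the (n-1)(n-2)/2 pairs
    that exclude the node being scored.
    """
    if n < 3:
        raise ValueError("need at least three nodes")
    # Build the graph Laplacian L = D - A.
    lap = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1.0
        lap[v][v] += 1.0
        lap[u][v] -= 1.0
        lap[v][u] -= 1.0
    # Ground the last node: invert the Laplacian without its last row/column.
    inv = invert([row[: n - 1] for row in lap[: n - 1]])
    c = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        for j in range(n - 1):
            c[i][j] = inv[i][j]
    score = [0.0] * n
    for s, t in combinations(range(n), 2):
        # Potentials for one unit of current entering at s and leaving at t.
        p = [c[v][s] - c[v][t] for v in range(n)]
        through = [0.0] * n
        for u, v in edges:
            flow = abs(p[u] - p[v])  # current on a unit-resistance edge
            through[u] += flow / 2.0
            through[v] += flow / 2.0
        for v in range(n):
            if v != s and v != t:
                score[v] += through[v]
    pairs = (n - 1) * (n - 2) / 2.0
    return [x / pairs for x in score]
```

On the three-node path 0-1-2, `current_flow_betweenness(3, [(0, 1), (1, 2)])` evaluates to `[0.0, 1.0, 0.0]`, since all current between the two endpoints must cross the middle node. The dense inversion costs O(n^3) time and O(n^2) memory, which illustrates why the exact index becomes impractical on large graphs.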
Year
2016
DOI
10.1007/978-3-319-49583-5_21
Venue
ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2016
Keywords
Centrality measure, Thinking like a vertex, Apache Spark
Field
Graph, Spark (mathematics), Computer science, Flow (psychology), Parallel computing, Centrality, Theoretical computer science, Betweenness centrality, Artificial intelligence, Machine learning
DocType
Conference
Volume
10048
ISSN
0302-9743
Citations
0
PageRank
0.34
References
20
Authors
3
Name                     Order  Citations  PageRank
Massimiliano Bertolucci  1      5          0.75
Alessandro Lulli         2      82         10.35
L. Ricci                 3      82         14.76