Title
PGX.D: a fast distributed graph processing engine
Abstract
Graph analysis is a powerful method in data analysis. Although several frameworks have been proposed for processing large graph instances in distributed environments, their performance is much lower than using efficient single-machine implementations provided with enough memory. In this paper, we present a fast distributed graph processing system, namely PGX.D. We show that PGX.D outperforms other distributed graph systems like GraphLab significantly (3x to 90x). Furthermore, PGX.D on 4 to 16 machines is also faster than an implementation optimized for single-machine execution. Using a fast cooperative context-switching mechanism, we implement PGX.D as a low-overhead, bandwidth-efficient communication framework that supports remote data-pulling patterns. Moreover, PGX.D achieves large traffic reduction and good workload balance by applying selective ghost nodes, edge partitioning, and edge chunking transparently to the user. Our analysis confirms that each of these features is indeed crucial for the overall performance of certain kinds of graph algorithms. Finally, we advocate the use of balanced beefy clusters where the sustained random DRAM-access bandwidth in aggregate is matched with the bandwidth of the underlying interconnection fabric.
Year
2015
DOI
10.1145/2807591.2807620
Venue
International Conference for High Performance Computing, Networking, Storage, and Analysis
Keywords
PGX.D, distributed graph processing engine, graph analysis, data analysis, cooperative context-switching mechanism, low-overhead communication, bandwidth-efficient communication, remote data-pulling patterns, traffic reduction, workload balance, ghost nodes, edge partitioning, edge chunking, graph algorithms, balanced beefy clusters, DRAM, access bandwidth
Field
Graph database, Algorithm design, Computer science, Parallel computing, Power graph analysis, Bandwidth (signal processing), Chunking (psychology), Wait-for graph, Cluster analysis, Graph partition, Distributed computing
DocType
Conference
ISBN
978-1-5090-0273-3
Citations
31
PageRank
0.79
References
20
Authors
6
Name                 Order  Citations  PageRank
Sungpack Hong        1      864        33.20
Siegfried Depner     2      31         0.79
Thomas Manhardt      3      31         0.79
Jan Van Der Lugt     4      34         1.20
Merijn Verstraaten   5      56         4.96
Hassan Chafi         6      1118       61.11