Title
Towards memory and computation efficient graph processing on spark
Abstract
Algorithms for large scale natural graph processing can be categorized into two types based on their value propagation behaviors: the unidirectional value propagation (UVP) algorithms and the bidirectional value propagation (BVP) algorithms. The behavior about how vertices interact with neighbors also differs between two algorithm types, which demands different system design choices. However, current distributed graph processing systems usually try to support both types in one general-purpose framework Such system design can not promise good performance and low resource consumption for both types. Especially, for UVP algorithms, current systems can not guarantee low memory footprint, computation efficiency and communication efficiency at the same time. In this paper, we propose a new graph processing engine on Spark, GraphV, which is specially designed for the unidirectional value propagation algorithms, and can satisfy all the above requirements for this type of algorithms. To retain the generalization for other algorithms, we also build a dual-engine framework by integrating GraphV with Spark's existing graph processing engine GraphX. The main design choices of GraphV include a cheap propagation-related partitioner, an one-step computation model, and a locality-aware local graph layout. According to the experiment results, GraphV is faster than GraphX by the factors of 1.2x-3.1x, with much less resource consumption. The source code of GraphV will be publicly available from http://prof.ict.ac.cn/GraphV.
Year
DOI
Venue
2017
10.1109/BigData.2017.8257948
2017 IEEE International Conference on Big Data (Big Data)
Keywords
DocType
ISSN
distributed system,Spark,big graph
Conference
2639-1589
ISBN
Citations 
PageRank 
978-1-5386-2716-7
0
0.34
References 
Authors
0
4
Name
Order
Citations
PageRank
Xinhui Tian162.48
Yuanqing Guo200.34
Jianfeng Zhan376762.86
Lei Wang457746.85