Title
SGraph: A Distributed Streaming System for Processing Big Graphs.
Abstract
Big graph processing is widely used in computational domains ranging from language modeling to social networks. Graph-parallel systems have been proposed to process such graphs on clusters of up to hundreds of nodes. However, a big graph often exceeds the main memory available in a small cluster, so task failures happen frequently. To address this problem, we propose SGraph, a distributed streaming graph processing system built on top of Spark. SGraph introduces a streaming data model that avoids loading the entire graph, which may exceed the available RAM. In addition, SGraph adopts an edge-centric scatter-gather computing model in which graph algorithms can be implemented conveniently. Experiments demonstrate that SGraph can process graphs with up to 1.5 billion edges on small clusters of several low-cost commodity PCs, whereas existing systems may require tens or even hundreds of high-end machines. Furthermore, SGraph is up to 2.3 times faster than existing systems.
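The abstract names two ideas: streaming the graph instead of loading it whole, and an edge-centric scatter-gather model. The sketch below illustrates both on a single machine, assuming only that vertex state fits in memory while edges arrive in chunks. It is not SGraph's API and does not use Spark; the names `stream_edges` and `connected_components` and the `chunk_size` parameter are hypothetical, chosen for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Edge = Tuple[int, int]


def stream_edges(edges: Iterable[Edge], chunk_size: int) -> Iterator[List[Edge]]:
    """Yield edges in chunks so the full edge list is never materialized here."""
    chunk: List[Edge] = []
    for edge in edges:
        chunk.append(edge)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def connected_components(edge_source: Callable[[], Iterable[Edge]],
                         vertices: Iterable[int],
                         chunk_size: int = 2) -> Dict[int, int]:
    """Label every vertex with the smallest vertex id in its undirected component."""
    label = {v: v for v in vertices}  # vertex state is assumed to fit in memory
    changed = True
    while changed:
        changed = False
        # Each iteration re-streams the edges from the source, chunk by chunk.
        for chunk in stream_edges(edge_source(), chunk_size):
            # Scatter: every edge emits an update carrying each endpoint's label.
            updates = []
            for src, dst in chunk:
                updates.append((dst, label[src]))
                updates.append((src, label[dst]))
            # Gather: each target vertex keeps the minimum label it has seen.
            for target, value in updates:
                if value < label[target]:
                    label[target] = value
                    changed = True
    return label


# Hypothetical toy graph: two components, {1, 2, 3} and {4, 5}.
edges = [(1, 2), (2, 3), (4, 5)]
components = connected_components(lambda: iter(edges), vertices=range(1, 6))
```

Because the computation iterates over edges and not over per-vertex adjacency lists, each pass reads the edge stream sequentially and needs only one chunk of edges in memory at a time. That is the property a streaming data model relies on.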
Year: 2016
DOI: 10.1007/978-3-319-42553-5_24
Venue: Lecture Notes in Computer Science
Keywords: Distributed computing, Graph processing, Streaming
Field: Data mining, Cluster (physics), Graph algorithms, Graph, Social network, Spark (mathematics), Computer science, Ranging, Streaming data, Language model, Distributed computing
DocType: Conference
Volume: 9784
ISSN: 0302-9743
Citations: 0
PageRank: 0.34
References: 6
Authors: 5
Name            Order  Citations  PageRank
Cheng Chen      1      0          0.34
Hejun Wu        2      42         23.03
Dyce Jing Zhao  3      5          1.43
Da Yan          4      387        34.45
James Cheng     5      2044       101.89