Title | ||
---|---|---|
DSP-CC: I/O Efficient Parallel Computation of Connected Components in Billion-scale Networks |
Abstract | ||
---|---|---|
Computing connected components is a core operation on graph data. Since billion-scale graphs cannot be resident in memory of a single server, several approaches based on distributed machines have recently been proposed. The representative methods are Hash-To-Min and PowerGraph. Hash-To-Min is the state-of-the art disk-based distributed method which minimizes the number of MapReduce rounds. PowerGraph is the-state-of-the-art in-memory distributed system, which is typically faster than the disk-based distributed one, however, requires a lot of machines for handling billion-scale graphs. In this paper, we propose an I/O efficient parallel algorithm for billion-scale graphs in a single PC. We first propose the Disk-based Sequential access-oriented Parallel processing (DSP) model that exploits sequential disk access in terms of disk I/Os and parallel processing in terms of computation. We then propose an ultra-fast disk-based parallel algorithm for computing connected components, DSP-CC, which largely improves the performance through sequential disk scan and page-level cache-conscious parallel processing. Extensive experimental results show that DSP-CC 1) computes connected components in billion-scale graphs using the limited memory size whereas in-memory algorithms can only support medium-sized graphs with the same memory size, and 2) significantly outperforms all distributed competitors as well as a representative disk-based parallel method. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/TKDE.2015.2419665 | IEEE Transactions on Knowledge and Data Engineering |
Keywords | Field | DocType |
Graphs, disk-based, parallel, connected components, SSD | Graph,Digital signal processing,Parallel algorithm,Computer science,Parallel computing,Input/output,Memory management,Connected component,Hash function,Computation | Journal |
Volume | Issue | ISSN |
PP | 99 | 1041-4347 |
Citations | PageRank | References |
2 | 0.36 | 19 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Min-Soo Kim | 1 | 5 | 2.11 |
Sangyeon Lee | 2 | 133 | 3.82 |
Wook-Shin Han | 3 | 805 | 57.85 |
Himchan Park | 4 | 41 | 4.73 |
Jeong-Hoon Lee | 5 | 291 | 16.06 |