Abstract | ||
---|---|---|
Disk-based graph systems store part or all of graph data on external devices like hard drives or SSDs, achieving scalability without excessive hardware. However, massive expensive disk I/Os remain the major performance bottleneck of disk-based graph processing. In this paper, we propose Redio, a new approach to accelerating disk-based graph processing by reducing disk I/Os. First, Redio observes that it is feasible to accommodate all vertex states in main memory and this can eliminate almost all vertex-related disk I/Os. Second, Redio introduces a dynamic selective scheduling scheme to identify inactive edges in each iteration and skip them when and only when such skipping can bring performance benefit. To improve its effectiveness, Redioin corporates a compact edge storage to improve data locality and an indexed bitmap to minimize its memory and computation overheads. We have implemented a single-node prototype for Redio under the edge-centric computation model. Extensive experiments show that Redio consistently outperforms well-known edge-centric disk-based systems in all experiments, delivering an average speedup of
<inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$4.33\times$</tex-math><alternatives><inline-graphic xlink:href="zhang-ieq1-2875458.gif"/></alternatives></inline-formula>
on HDDs and
<inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$5.33\times$</tex-math><alternatives><inline-graphic xlink:href="zhang-ieq2-2875458.gif"/></alternatives></inline-formula>
on SSDs over the fastest among them (i.e., GridGraph). Experimental results also show that Redio delivers an average speedup of
<inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$3.13\times$</tex-math><alternatives><inline-graphic xlink:href="zhang-ieq3-2875458.gif"/></alternatives></inline-formula>
on HDDs and
<inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$1.28\times$</tex-math><alternatives><inline-graphic xlink:href="zhang-ieq4-2875458.gif"/></alternatives></inline-formula>
on SSDs over the fastest among representative vertex-centric disk-based systems (i.e., FlashGraph). |
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/TC.2018.2875458 | IEEE Transactions on Computers |
Keywords | Field | DocType |
Computational modeling,Acceleration,Optimization,Hardware,Performance evaluation,Dynamic scheduling,Indexes | Bottleneck,Locality,Scheduling (computing),Computer science,Parallel computing,Bitmap,Dynamic priority scheduling,Computation,Speedup,Scalability | Journal |
Volume | Issue | ISSN |
68 | 3 | 0018-9340 |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wu, Cheng-Wen | 1 | 1843 | 170.44 |
Guangyan Zhang | 2 | 171 | 16.20 |
Yang Wang | 3 | 31 | 4.23 |
Jiang, X. | 4 | 1 | 2.07 |
Weimin Zheng | 5 | 1889 | 182.48 |