Abstract | ||
---|---|---|
Large-scale network and graph analysis has received considerable attention recently. Graph mining techniques often involve an iterative algorithm, which can be implemented in a variety of ways. Using PageRank as a model problem, we look at three algorithm design axes: work activation, data access pattern, and scheduling. We investigate the impact of different algorithm design choices. Using these design axes, we design and test a variety of PageRank implementations finding that data-driven, push-based algorithms are able to achieve more than 28x the performance of standard PageRank implementations (e.g., those in GraphLab). The design choices affect both single-threaded performance as well as parallel scalability. The implementation lessons not only guide efficient implementations of many graph mining algorithms, but also provide a framework for designing new scalable algorithms. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1007/978-3-662-48096-0_34 | Lecture Notes in Computer Science |
Keywords | Field | DocType |
Scalable computing,Graph analytics,PageRank,Multi-threaded programming,Data-driven algorithm | PageRank,Data-driven,Algorithm design,Computer science,Scheduling (computing),Algorithm,Power graph analysis,Implementation,Data access,Scalability | Conference |
Volume | ISSN | Citations |
9233 | 0302-9743 | 7 |
PageRank | References | Authors |
0.53 | 7 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Joyce Jiyoung Whang | 1 | 161 | 9.43 |
Andrew Lenharth | 2 | 456 | 19.94 |
Inderjit S. Dhillon | 3 | 7601 | 450.15 |
Keshav Pingali | 4 | 3056 | 256.64 |