Solving path problems on the GPU - Citegraph

Paper Info

Title
Solving path problems on the GPU

Abstract
We consider the computation of shortest paths on Graphic Processing Units (GPUs). The blocked recursive elimination strategy we use is applicable to a class of algorithms (such as all-pairs shortest-paths, transitive closure, and LU decomposition without pivoting) having similar data access patterns. Using the all-pairs shortest-paths problem as an example, we uncover potential gains over this class of algorithms. The impressive computational power and memory bandwidth of the GPU make it an attractive platform to run such computationally intensive algorithms. Although improvements over CPU implementations have previously been achieved for those algorithms in terms of raw speed, the utilization of the underlying computational resources was quite low. We implemented a recursively partitioned all-pairs shortest-paths algorithm that harnesses the power of GPUs better than existing implementations. The alternate schedule of path computations allowed us to cast almost all operations into matrix-matrix multiplications on a semiring. Since matrix-matrix multiplication is highly optimized and has a high ratio of computation to communication, our implementation does not suffer from the premature saturation of bandwidth resources as iterative algorithms do. By increasing temporal locality, our implementation runs more than two orders of magnitude faster on an NVIDIA 8800 GPU than on an Opteron. Our work provides evidence that programmers should rethink algorithms instead of directly porting them to GPU.

Year	DOI	Venue
2010	10.1016/j.parco.2009.12.002	Parallel Computing
Keywords	Field	DocType
shorte st path,linear algebra,cpu implementation,bandwidth resource,matrix multiplication,graph algorithm,graphical processing unit,shortest-paths algorithm,recursively partitioned all-pairs,gaussian elimination,shortest path,all-pairs shortest-paths problem,all-pairs shortest-paths,graphica l processing unit,impressive computational power,semiring,memory bandwidth,path problem,path computation,matrix-matrix multiplication,all-pairs shortest paths,data access,recursive partitioning,iterative algorithm,all pairs shortest path,transitive closure	Locality of reference,Memory bandwidth,Shortest path problem,Computer science,CUDA,Parallel computing,Theoretical computer science,Multiplication,Gaussian elimination,Transitive closure,LU decomposition	Journal
Volume	Issue	ISSN
36	5-6	Parallel Computing
Citations	PageRank	References
46	1.77	24
Authors
3

Authors (3 rows)

Cited by (46 rows)

References (24 rows)

Name	Order	Citations	PageRank
Aydin Buluc	1	1057	67.49
John R. Gilbert	2	2369	308.81
Ceren Budak	3	109	6.60

1