Parallelization of reordering algorithms for bandwidth and wavefront reduction - Citegraph

Paper Info

Title
Parallelization of reordering algorithms for bandwidth and wavefront reduction

Abstract
Many sparse matrix computations can be speeded up if the matrix is first reordered. Reordering was originally developed for direct methods but it has recently become popular for improving the cache locality of parallel iterative solvers since reordering the matrix to reduce bandwidth and wavefront can improve the locality of reference of sparse matrix-vector multiplication (SpMV), the key kernel in iterative solvers. In this paper, we present the first parallel implementations of two widely used reordering algorithms: Reverse Cuthill-McKee (RCM) and Sloan. On 16 cores of the Stampede supercomputer, our parallel RCM is 5.56 times faster on the average than a state-of-the-art sequential implementation of RCM in the HSL library. Sloan is significantly more constrained than RCM, but our parallel implementation achieves a speedup of 2.88X on the average over sequential HSL-Sloan. Reordering the matrix using our parallel RCM and then performing 100 SpMV iterations is twice as fast as using HSL-RCM and then performing the SpMV iterations; it is also 1.5 times faster than performing the SpMV iterations without reordering the matrix.

Year	DOI	Venue
2014	10.1109/SC.2014.80	SC
Keywords	Field	DocType
programming model,load balancing	Kernel (linear algebra),Locality of reference,Supercomputer,Matrix (mathematics),Load balancing (computing),Computer science,Parallel computing,Algorithm,Bandwidth (signal processing),Multiplication,Speedup,Distributed computing	Conference
ISSN	Citations	PageRank
2167-4329	10	0.60
References	Authors
23	5

Authors (5 rows)

Cited by (10 rows)

References (23 rows)

Name	Order	Citations	PageRank
Konstantinos I. Karantasis	1	10	0.60
Andrew Lenharth	2	456	19.94
Donald Nguyen	3	419	17.94
María Jesús Garzarán	4	411	34.13
Keshav Pingali	5	3056	256.64

1