Breaking The Performance Bottleneck Of Sparse Matrix-Vector Multiplication On SIMD Processors

Paper Info

Title
Breaking The Performance Bottleneck Of Sparse Matrix-Vector Multiplication On SIMD Processors

Abstract
Sparse matrix-vector multiplication (SpMV) is one of the most important kernels in many scientific and engineering applications. On SIMD processors, its main performance bottleneck is the low utilization of SIMD units and memory bandwidth. This paper proposes a hybrid optimization method to break that bottleneck. The method comprises a new compressed sparse matrix format, a block SpMV algorithm, and a vector write buffer. Experimental results show that the hybrid optimization method achieves an average speedup of 2.09 over the CSR vector kernel across all the matrices, with a maximum speedup of 3.24.
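For orientation, the CSR layout that the abstract uses as its baseline can be sketched as follows. This is a minimal scalar illustration written for this record, not the paper's stride-combination format, its block algorithm, or the vectorized CSR kernel it benchmarks against; the names `csr_spmv`, `values`, `col_idx`, and `row_ptr` are assumptions chosen for the example.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries of A, listed row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect load: x is gathered through col_idx.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Illustrative 3x3 matrix (hypothetical data):
# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]
y = csr_spmv(values, col_idx, row_ptr, x)
```

The inner loop reads `x` through `col_idx`, so consecutive iterations touch non-contiguous memory, and row lengths vary from row to row. Both properties make it hard to keep SIMD lanes full, which is the kind of low utilization the abstract refers to.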

Year	DOI	Venue
2013	10.1587/elex.10.20130147	IEICE ELECTRONICS EXPRESS
Keywords	Field	DocType
SpMV, SIMD, CSR, stride-combination CSR with transpose	Bottleneck, Computer science, Sparse matrix-vector multiplication, Parallel computing, SIMD	Journal
Volume	Issue	ISSN
10	9	1349-2543
Citations	PageRank	References
1	0.48	1
Authors
4

Authors (4 rows)

Name	Order	Citations	PageRank
Kai Zhang	1	1	0.48
Shuming Chen	2	138	38.21
Yaohua Wang	3	44	14.23
Jiang-Hua Wan	4	15	5.86