Title
Auto-Tuning GEMV on Many-Core GPU
Abstract
GPUs provide powerful computing ability, especially for data-parallel algorithms. However, the complexity of the GPU system makes optimizing even a simple algorithm difficult, and different parallel algorithms or optimization methods on a GPU often lead to very different performance. The matrix-vector multiplication routine for general dense matrices (GEMV) is a building block for many scientific and engineering computations. We find that the implementations of GEMV in CUBLAS 4.0 and MAGMA are not efficient, especially for small matrices and fat matrices (matrices with a small number of rows and a large number of columns). In this paper, we propose two new algorithms to optimize GEMV on Fermi GPUs. First, instead of using only one thread, we use a warp to compute each element of the output vector y. Second, we propose a novel register blocking method to accelerate GEMV on the GPU further. The proposed optimization methods are comprehensively evaluated on matrices of different sizes. Experimental results show that the new methods achieve over 10x speedup for small square matrices and fat matrices compared to CUBLAS 4.0 or MAGMA, and that the new register blocking method also performs better than CUBLAS 4.0 or MAGMA for large square matrices. We also propose a performance-tuning framework for choosing an optimal GEMV algorithm for an arbitrary input matrix on the GPU.
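The warp-per-element idea in the abstract can be illustrated with a minimal CUDA sketch. It is not the paper's implementation: it assumes single precision and row-major storage of A, and the names `gemv_warp_per_row`, `gemv_on_device`, and `kWarpsPerBlock` are hypothetical. It also reduces partial sums with `__shfl_down_sync`, which requires compute capability 3.0 or later and is therefore not available on the Fermi GPUs the paper targets, where a shared-memory reduction would be needed instead.

```cuda
#include <cassert>
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

constexpr int kWarpSize = 32;
constexpr int kWarpsPerBlock = 4;  // hypothetical value; a tuner could search over it

// Computes y = A * x for a row-major m x n matrix A (assumption, see lead-in).
// Each warp cooperates on a single element y[row].
__global__ void gemv_warp_per_row(const float* A, const float* x, float* y,
                                  int m, int n) {
    int lane = threadIdx.x % kWarpSize;
    int row = blockIdx.x * kWarpsPerBlock + threadIdx.x / kWarpSize;
    // All lanes of a warp share the same row, so this exit is warp-uniform.
    if (row >= m) return;

    // Each lane accumulates a strided slice of the dot product; consecutive
    // lanes read consecutive columns, so the loads of A are coalesced.
    float sum = 0.0f;
    for (int col = lane; col < n; col += kWarpSize) {
        sum += A[static_cast<size_t>(row) * n + col] * x[col];
    }

    // Tree reduction of the 32 partial sums within the warp.
    for (int offset = kWarpSize / 2; offset > 0; offset /= 2) {
        sum += __shfl_down_sync(0xffffffffu, sum, offset);
    }

    if (lane == 0) y[row] = sum;
}

// Host wrapper: copies the inputs to the device, launches the kernel, and
// returns y. CUDA error checking is omitted to keep the sketch short.
std::vector<float> gemv_on_device(const std::vector<float>& A,
                                  const std::vector<float>& x, int m, int n) {
    assert(A.size() == static_cast<size_t>(m) * n);
    assert(x.size() == static_cast<size_t>(n));

    float *dA = nullptr, *dx = nullptr, *dy = nullptr;
    cudaMalloc(&dA, A.size() * sizeof(float));
    cudaMalloc(&dx, x.size() * sizeof(float));
    cudaMalloc(&dy, static_cast<size_t>(m) * sizeof(float));
    cudaMemcpy(dA, A.data(), A.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dx, x.data(), x.size() * sizeof(float), cudaMemcpyHostToDevice);

    int blocks = (m + kWarpsPerBlock - 1) / kWarpsPerBlock;
    gemv_warp_per_row<<<blocks, kWarpsPerBlock * kWarpSize>>>(dA, dx, dy, m, n);

    std::vector<float> y(m);
    cudaMemcpy(y.data(), dy, static_cast<size_t>(m) * sizeof(float),
               cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dx);
    cudaFree(dy);
    return y;
}
```

Calling `gemv_on_device(A, x, m, n)` with a fat matrix (small `m`, large `n`) shows why this mapping can help: a one-thread-per-row kernel would launch only `m` threads, whereas this sketch launches `32 * m` threads that share the work of each long row.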
Year
2012
DOI
10.1109/ICPADS.2012.15
Venue
ICPADS
Keywords
different parallel algorithm,fat matrix,fermi gpu,arbitrary input matrix,many-core gpu,auto-tuning gemv,large square matrix,different performance,general dense matrix,gpu system,small square matrix,small matrix,vectors,matrix multiplication,parallel algorithms
Field
Parallel algorithm,Computer science,Matrix (mathematics),Parallel computing,Square matrix,Thread (computing),Multiplication,Computational science,Matrix multiplication,Performance tuning,Distributed computing,Speedup
DocType
Conference
ISSN
1521-9097
Citations
3
PageRank
0.44
References
3
Authors
8
Name | Order | Citations | PageRank
XU Wei-Zhi | 1 | 36 | 8.65
Zhiyong Liu | 2 | 659 | 75.59
Jun Wu | 3 | 3 | 0.44
Xiaochun Ye | 4 | 125 | 28.41
JIAO Shuai | 5 | 30 | 3.38
Da Wang | 6 | 44 | 8.79
Fenglong Song | 7 | 66 | 9.09
FAN Dong-Rui | 8 | 222 | 38.18