Title
Block red-black MILU(0) preconditioner with relaxation on GPU
Abstract
To accelerate the Krylov subspace-based linear equation solvers on Graphics Processing Units (GPUs), a stable, efficient and highly parallel preconditioner is essential. One of the strong candidates for such a preconditioner is the combination of the block red-black ordering and the relaxed modified incomplete LU factorization without fill-ins (MILU(0)). In this paper, we present techniques for implementing this type of preconditioner on General-purpose computing on GPU (GPGPU) using OpenACC. Our implementation is designed for 3-dimensional finite-difference computations with 7-point stencil, and the matrix storage format is optimized to realize coalesced memory access. Also, mixed-precision computation is employed to exploit the high single-precision performance of GPUs without sacrificing the accuracy of the computed solution. Extensive numerical tests were performed and the optimal values of various tunable parameters such as the number of blocks in each direction and the number of workers specified in OpenACC clauses are discussed. Performance comparison on NVIDIA Quadro GP100 and Tesla K40t GPUs shows that our solver is much faster than existing libraries like cuSPARSE, MAGMA, ViennaCL, and Ginkgo, especially when multiple linear equations with coefficient matrices sharing the same nonzero pattern are solved.
Year: 2021
DOI: 10.1016/j.parco.2021.102760
Venue: Parallel Computing
Keywords: Sparse matrix, Preconditioning, Modified ILU, BiCGSTAB solver, GPGPU computing
DocType: Journal
Volume: 103
ISSN: 0167-8191
Citations: 0
PageRank: 0.34
References: 0
Authors: 2
Name              Order   Citations   PageRank
Akemi Shioya      1       0           0.34
Yusaku Yamamoto   2       52          20.61