Title | ||
---|---|---|
A High Performance Implementation of Zolo-SVD algorithm on Distributed Memory Systems |
Abstract | ||
---|---|---|
This paper introduces a high-performance implementation of the Zolo-SVD algorithm on distributed memory systems. Zolo-SVD is based on the polar decomposition (PD) algorithm via Zolotarev's function (Zolo-PD), originally proposed by Nakatsukasa and Freund [SIAM Review, 2016]. Our implementation relies heavily on ScaLAPACK routines and is therefore portable. Compared with other PD algorithms such as the QR-based dynamically weighted Halley method (QDWH-PD), Zolo-PD is naturally parallelizable and scales better, although it performs more floating-point operations. When many processors are used, Zolo-PD is usually 1.20 times faster than the QDWH-PD algorithm, and Zolo-SVD can be about two times faster than the ScaLAPACK routine PDGESVD. The numerical experiments were performed on the Tianhe-2A supercomputer, one of the fastest supercomputers in the world; the test matrices include sparse matrices from particular applications and randomly generated dense matrices of different dimensions. Our QDWH-SVD and Zolo-SVD implementations are freely available at https://github.com/shengguolsg/Zolo-SVD. |
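
The abstract describes an SVD built on top of a polar decomposition. As a rough illustration of that general idea only, the sketch below computes A = U_p H, diagonalizes the symmetric factor H = V Σ V^T, and forms A = (U_p V) Σ V^T. It is not the paper's method: the polar factor here comes from a plain Newton iteration (a swapped-in, much simpler technique than Zolo-PD or QDWH-PD), it runs serially with NumPy rather than ScaLAPACK, and it assumes a square, nonsingular, real matrix. The function names `polar_newton` and `svd_via_polar` are hypothetical.

```python
import numpy as np


def polar_newton(a, tol=1e-12, max_iter=100):
    """Polar decomposition A = U H of a square nonsingular real matrix.

    Uses the unscaled Newton iteration X <- (X + X^{-T}) / 2, which is an
    assumption of this sketch, not the Zolo-PD or QDWH-PD iteration.
    """
    x = np.array(a, dtype=float)
    for _ in range(max_iter):
        x_next = 0.5 * (x + np.linalg.inv(x).T)
        converged = np.linalg.norm(x_next - x, "fro") <= tol * np.linalg.norm(x_next, "fro")
        x = x_next
        if converged:
            break
    u = x
    h = u.T @ a
    h = 0.5 * (h + h.T)  # symmetrize to remove rounding noise
    return u, h


def svd_via_polar(a):
    """Return (U, sigma, V) with A = U diag(sigma) V^T, via polar + eigendecomposition."""
    u_p, h = polar_newton(a)
    evals, v = np.linalg.eigh(h)      # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]   # reorder to descending, as in an SVD
    sigma = evals[order]
    v = v[:, order]
    return u_p @ v, sigma, v


# Small, well-conditioned test matrix (made up for this example).
rng = np.random.default_rng(0)
a = rng.standard_normal((6, 6)) + 6.0 * np.eye(6)
u, sigma, v = svd_via_polar(a)
```

The only expensive kernels in this structure are the polar iteration and a symmetric eigensolver, which is why the choice of polar decomposition algorithm dominates the cost and scalability of the overall SVD.
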
Year | DOI | Venue |
---|---|---|
2019 | 10.1016/j.parco.2019.04.004 | Parallel Computing |
Keywords | Field | DocType |
ScaLAPACK,Polar decomposition,Zolotarev,QDWH,Distributed parallel algorithm | Parallelizable manifold,Singular value decomposition,Supercomputer,Matrix (mathematics),Computer science,Parallel computing,Algorithm,Polar decomposition,ScaLAPACK,Sparse matrix,Scalability | Journal |
Volume | ISSN | Citations |
86 | 0167-8191 | 0 |
PageRank | References | Authors |
0.34 | 0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
ShengGuo Li | 1 | 87 | 10.19 |
Jie Liu | 2 | 7 | 3.53 |
Yunfei Du | 3 | 72 | 14.62 |