Scaling Up Parallel Computation of Tiled QR Factorizations by a Distributed Scheduling Runtime System and Analytical Modeling.

Paper Info

Title
Scaling Up Parallel Computation of Tiled QR Factorizations by a Distributed Scheduling Runtime System and Analytical Modeling.

Abstract
Implementing parallel software for QR factorizations to achieve scalable performance on massively parallel manycore systems requires a comprehensive design that includes algorithm redesign, efficient runtime systems, synchronization and communication reduction, and analytical performance modeling. This paper presents tiled communication-avoiding QR factorization software that scales efficiently for matrices of general dimensions. We design a tiled communication-avoiding QR factorization algorithm and implement it with a fully distributed dynamic scheduling runtime system to minimize both synchronization and communication. The whole class of communication-avoiding QR factorization algorithms depends on an important parameter D (the number of domains), whose best value is still unknown and must be found by manual tuning and empirical search. To that end, we introduce a simplified analytical performance model to determine an optimal number of domains D*. The experimental results show that our new parallel implementation is up to 30% faster than a state-of-the-art multicore-based numerical library, and up to 30 times faster than ScaLAPACK on thousands of CPU cores. Furthermore, predicting the number of domains with the new analytical model is competitive with exhaustive search, with an average performance difference of 1%.

Field	Value
Year	2018
DOI	10.1142/S0129626418500044
Venue	Parallel Processing Letters
Keywords	High performance computing, numerical libraries, analytical performance modeling
DocType	Journal
Volume	28
Issue	1
ISSN	0129-6264
Citations	0
PageRank	0.34
References	3
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Weijian Zheng	1	0	1.69
Fengguang Song	2	2	2.42
Lan Lin	3	4	8.21
Zizhong Chen	4	924	69.93
