Title
Multi-threading and one-sided communication in parallel LU factorization
Abstract
Dense LU factorization has a high ratio of computation to communication and, as evidenced by the High Performance Linpack (HPL) benchmark, this property makes it scale well on most parallel machines. Nevertheless, the standard algorithm for this problem has non-trivial dependence patterns which limit parallelism, and local computations require large matrices in order to achieve good single-processor performance. We present an alternative programming model for this type of problem, which combines UPC's global address space with lightweight multithreading. We introduce the concept of memory-constrained lookahead, where the amount of concurrency managed by each processor is controlled by the amount of memory available. We implement novel techniques for steering the computation to optimize for high performance and demonstrate the scalability and portability of UPC with teraflop-level performance on some machines, comparing favourably to other state-of-the-art MPI codes.
Year: 2007
DOI: 10.1145/1362622.1362664
Venue: SC
Keywords: dense linear algebra, latency tolerance, multithreading
Field: Multithreading, Programming paradigm, Supercomputer, Computer science, Concurrency, Parallel computing, Software portability, Concurrent computing, LU decomposition, Distributed computing, Scalability
DocType: Conference
ISBN: 978-1-59593-764-3
Citations: 13
PageRank: 0.82
References: 15
Authors: 2
Name | Order | Citations | PageRank
Parry Husbands | 1 | 561 | 56.37
Katherine A. Yelick | 2 | 3494 | 407.23