Title
Communication-Aware Load Balancing of the LU Factorization over Heterogeneous Clusters
Abstract
Supercomputers are designed to be as homogeneous as possible, but it is common for a few nodes to exhibit variable performance capabilities due to processor manufacturing. It is also common to find partitions equipped with different types of accelerators. Data distribution over heterogeneous nodes is very challenging but essential to exploit all resources efficiently. In this article, we build upon the flexibility of task-based runtimes in managing data to study the interplay between static communication-aware data distribution strategies and dynamic scheduling of the linear algebra LU factorization over heterogeneous sets of hybrid nodes. We propose two techniques derived from the state-of-the-art 1D×1D data distributions. The first uses fewer computing nodes towards the end to better match performance bounds and save computing power. The second carefully moves a few blocks between nodes to further optimize the load balancing among them. We also demonstrate how 1D×1D data distributions, tailored for heterogeneous nodes, can scale better on homogeneous clusters than classical block-cyclic distributions. Validation is carried out in both real and simulated environments, on homogeneous and heterogeneous platforms, demonstrating compelling performance improvements.
Year: 2020
DOI: 10.1109/ICPADS51040.2020.00017
Venue: 2020 IEEE 26th International Conference on Parallel and Distributed Systems (ICPADS)
Keywords: Data Partitioning, LU Factorization, Load Balancing, Task-Based Applications, Heterogeneous Clusters
DocType: Conference
ISSN: 1521-9097
ISBN: 978-1-7281-8382-4
Citations: 1
PageRank: 0.36
References: 0
Authors: 3
Name | Order | Citations | PageRank
Lucas Nesi | 1 | 1 | 0.36
Lucas Mello Schnorr | 2 | 96 | 15.10
Arnaud Legrand | 3 | 1511 | 91.94