Title
Asynchronous and multithreaded communications on irregular applications using vectorized divide and conquer approach.
Abstract
The evolution of hardware architectures, driven by the increasing requirement for performance and energy efficiency, has led to complex HPC systems. In the context of Finite Element Methods, exposing massive parallelism on unstructured mesh computations with efficient load balancing and minimal synchronization is challenging. Several parallelization strategies have to be combined to exploit the multiple levels of parallelism. We propose several contributions aimed at handling irregular codes and data structures efficiently. We have developed a hybrid parallelization approach based on the Divide & Conquer (D&C) principle, which combines the distributed, shared, and vectorial forms of parallelism in a fine-grained, task-based approach applied to irregular structures. We evaluate our approach on the matrix assembly step of an industrial application from Dassault Aviation, on standard Xeon multicores and Xeon Phi KNC manycores. On 512 Intel Xeon E5-2670 Sandy Bridge cores, we surpass the pure MPI approach by up to 3.47× and reach 77% parallel efficiency using only 2000 vertices per core. On 4 Xeon Phi 5110p KNC, D&C has similar performance to 96 Intel Xeon E5-2670 Sandy Bridge cores; it achieves an excellent parallel efficiency of 96%, and up to 6.56× speedup compared to pure MPI.
Year
2018
DOI
10.1016/j.jpdc.2017.12.004
Venue
Journal of Parallel and Distributed Computing
Keywords
00-01,99-00
Field
Asynchronous communication, Data structure, Load balancing (computing), Efficient energy use, Computer science, Massively parallel, Parallel computing, Exploit, Divide and conquer algorithms, Speedup
DocType
Journal
Volume
114
ISSN
0743-7315
Citations
0
PageRank
0.34
References
24
Authors
2
Name            Order  Citations  PageRank
Loïc Thébault   1      13         2.29
Eric Petit      2      58         12.73