Title
Design and Evaluation of Shared Memory Communication Benchmarks on Emerging Architectures using MVAPICH2
Abstract
Recent advances in processor technologies have led to highly multi-threaded and dense multi- and many-core HPC systems. The adoption of such dense multi-core processors is widespread in the Top500 systems. The Message Passing Interface (MPI) has been widely used to scale out scientific applications. Intra-node communication in MPI is mainly based on shared memory communication. The increased core density of modern processors warrants the use of efficient shared memory communication designs to achieve optimal performance. While various algorithms and data structures have been proposed in the literature for producer-consumer-like scenarios, there is a need to revisit them in the context of MPI communication to identify the designs that perform best on modern architectures. In this paper, we first propose a set of low-level benchmarks to evaluate various data structures such as Lamport queues, Fast-Forward queues, and Fastboxes (FB) for shared memory communication. Then, we bring these designs into the MVAPICH2 MPI library and measure their impact on MPI intra-node communication for a wide variety of communication patterns. The benchmarks are run on modern multi-/many-core architectures including Intel Xeon Cascade Lake and Intel Knights Landing.
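To make the first of the data structures named in the abstract concrete, the following is a minimal sketch of a Lamport-style single-producer, single-consumer ring buffer in C11. It is not taken from the paper or from MVAPICH2: the names `lamport_queue`, `lq_init`, `lq_enqueue`, `lq_dequeue`, and `QUEUE_SLOTS` are hypothetical, the payload is a plain `int` rather than an MPI message cell, and acquire/release atomics stand in for the sequential consistency that the original algorithm assumes. In an intra-node MPI setting the structure would have to live in a memory region mapped by both processes, which this sketch omits.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SLOTS 8 /* hypothetical capacity; one slot always stays empty */

typedef struct {
    _Atomic size_t head;    /* next slot to read; written only by the consumer */
    _Atomic size_t tail;    /* next slot to write; written only by the producer */
    int slots[QUEUE_SLOTS]; /* payload; an int stands in for a message cell */
} lamport_queue;

static void lq_init(lamport_queue *q) {
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Producer side: returns false when the queue is full. */
static bool lq_enqueue(lamport_queue *q, int value) {
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t next = (tail + 1) % QUEUE_SLOTS;
    if (next == atomic_load_explicit(&q->head, memory_order_acquire))
        return false; /* advancing tail would collide with head: full */
    q->slots[tail] = value;
    /* Release makes the payload write visible before the new tail is. */
    atomic_store_explicit(&q->tail, next, memory_order_release);
    return true;
}

/* Consumer side: returns false when the queue is empty. */
static bool lq_dequeue(lamport_queue *q, int *out) {
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    if (head == atomic_load_explicit(&q->tail, memory_order_acquire))
        return false; /* head has caught up with tail: empty */
    *out = q->slots[head];
    /* Release hands the slot back to the producer only after the read. */
    atomic_store_explicit(&q->head, (head + 1) % QUEUE_SLOTS,
                          memory_order_release);
    return true;
}
```

In this layout both sides read the other side's index on every operation, so the `head` and `tail` cache lines bounce between the producer and consumer cores. Fast-Forward-style queues are generally described as avoiding that sharing by marking empty slots with a sentinel value instead of comparing indices, which is one reason such designs are worth comparing on dense multi-core nodes.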
Year: 2019
DOI: 10.1109/IPDRM49579.2019.00010
Venue: 2019 IEEE/ACM Third Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware (IPDRM)
Keywords: HPC, MPI, Broadcast, Collective, Multi-endpoints, K-nomial, Algorithms
DocType: Conference
ISBN: 978-1-7281-5994-2
Citations: 0
PageRank: 0.34
References: 7
Authors: 5
Name | Order | Citations | PageRank
Shulei Xu | 1 | 1 | 1.73
Jahanzeb Maqbool Hashmi | 2 | 42 | 7.43
Sourav Chakraborty | 3 | 381 | 49.27
Hari Subramoni | 4 | 466 | 50.51
Dhabaleswar K. Panda | 5 | 5366 | 446.70