Abstract |
---|
With the emergence of larger multi-/many-core clusters and new areas of HPC applications, the performance of large-message communication is becoming more important. MPI libraries use different rendezvous protocols to perform large-message communication. However, existing rendezvous protocols neither take the overall communication pattern into account nor make optimal use of the sender and receiver CPUs. In this work, we propose a cooperative rendezvous protocol that can provide up to 2x improvement in intra-node bandwidth and latency for large messages. We also propose designs to dynamically choose the best rendezvous protocol for each message based on the overall communication pattern. Finally, we show how these improvements can increase the overlap of intra-node communication and computation with inter-node communication and lead to application-level benefits at scale. We evaluate the proposed designs on three different architectures (Intel Xeon, Knights Landing, and OpenPOWER) against state-of-the-art MPI libraries including MVAPICH2 and Open MPI. Compared to existing designs, the proposed designs show benefits of up to 19% with Graph500, 16% with CoMD, and 10% with MiniGhost. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/SC.2018.00031 | SC |
Keywords | Field | DocType |
Protocols,Receivers,Libraries,Peer-to-peer computing,Hardware,Computer architecture,Runtime | Latency (engineering),Computer science,Peer to peer computing,Computer network,Communication source,Bandwidth (signal processing),Rendezvous,Xeon,Graph500,Computation,Distributed computing | Conference |
ISBN | Citations | PageRank |
978-1-5386-8384-2 | 0 | 0.34 |
References | Authors | |
0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sourav Chakraborty | 1 | 381 | 49.27 |
M. Bayatpour | 2 | 12 | 5.43 |
Jahanzeb Maqbool Hashmi | 3 | 42 | 7.43 |
Hari Subramoni | 4 | 466 | 50.51 |
Dhabaleswar K. Panda | 5 | 5366 | 446.70 |