Cooperative rendezvous protocols for improved performance and overlap.

Paper Info

Title
Cooperative rendezvous protocols for improved performance and overlap.

Abstract
With the emergence of larger multi-/many-core clusters and new areas of HPC applications, performance of large message communication is becoming more important. MPI libraries use different rendezvous protocols to perform large message communication. However, existing rendezvous protocols do not take the overall communication pattern into account or make optimal use of the Sender and the Receiver CPUs. In this work, we propose a cooperative rendezvous protocol that can provide up to 2x improvement in intra-node bandwidth and latency for large messages. We also propose designs to dynamically choose the best rendezvous protocol for each message based on the overall communication pattern. Finally, we show how these improvements can increase the overlap of intra-node communication and computation with inter-node communication and lead to application level benefits at scale. We evaluate the proposed designs on three different architectures - Intel Xeon, Knights Landing, and OpenPOWER against state-of-the-art MPI libraries including MVAPICH2 and Open MPI. Compared to existing designs, the proposed designs show benefits of up to 19% with Graph500, 16% with CoMD, and 10% with MiniGhost.

Year: 2018
DOI: 10.1109/SC.2018.00031
Venue: SC
Keywords: Protocols, Receivers, Libraries, Peer-to-peer computing, Hardware, Computer architecture, Runtime
Field: Latency (engineering), Computer science, Peer to peer computing, Computer network, Communication source, Bandwidth (signal processing), Rendezvous, Xeon, Graph500, Computation, Distributed computing
DocType: Conference
ISBN: 978-1-5386-8384-2
Citations: 0
PageRank: 0.34
References: 0
Authors: 5

Authors

Name	Order	Citations	PageRank
Sourav Chakraborty	1	381	49.27
M. Bayatpour	2	12	5.43
Jahanzeb Maqbool Hashmi	3	42	7.43
Hari Subramoni	4	466	50.51
Dhabaleswar K. Panda	5	5366	446.70