Title
Maximizing MPI point-to-point communication performance on RDMA-enabled clusters with customized protocols
Abstract
Message Passing Interface (MPI) point-to-point communications are usually realized with two protocols: the eager protocol for small messages and the rendezvous protocol for medium-sized and large messages. Traditional sender-initiated rendezvous protocols are sub-optimal in many situations. In this work, we propose to refine the rendezvous protocol for medium-sized and large messages on RDMA-enabled clusters with three protocols customized for different situations: a hybrid protocol for medium-sized messages when the sender arrives early, a sender-initiated protocol for large messages when the sender arrives early, and a receiver-initiated protocol when the receiver arrives early. Compared with traditional sender-initiated rendezvous protocols, the proposed scheme reduces unnecessary synchronizations, decreases the number of control messages on the critical path of communications, and improves communication progress, which results in significantly better communication-computation overlap. We present and analyze these protocols and describe how they and the eager protocol can be seamlessly integrated in one system without introducing an excessive number of control messages. We have implemented the proposed scheme for InfiniBand clusters. The experimental results demonstrate the effectiveness of the proposed technique.
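The case split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name, the enum, and both size thresholds are hypothetical, since the abstract gives no numeric cutoffs. In a real MPI library neither side knows the arrival order in advance, so the sketch only shows which protocol applies once message size and arrival order are known.

```python
from enum import Enum


class Protocol(Enum):
    EAGER = "eager"
    HYBRID = "hybrid"
    SENDER_INITIATED = "sender-initiated"
    RECEIVER_INITIATED = "receiver-initiated"


# Hypothetical thresholds in bytes (assumptions, not values from the paper).
EAGER_LIMIT = 8 * 1024
MEDIUM_LIMIT = 64 * 1024


def select_protocol(msg_bytes, sender_arrived_first,
                    eager_limit=EAGER_LIMIT, medium_limit=MEDIUM_LIMIT):
    """Return the protocol the abstract associates with this situation."""
    if msg_bytes < 0:
        raise ValueError("message size must be non-negative")
    # Small messages use the eager protocol regardless of arrival order.
    if msg_bytes <= eager_limit:
        return Protocol.EAGER
    # Receiver arrived early: receiver-initiated rendezvous.
    if not sender_arrived_first:
        return Protocol.RECEIVER_INITIATED
    # Sender arrived early: hybrid for medium, sender-initiated for large.
    if msg_bytes <= medium_limit:
        return Protocol.HYBRID
    return Protocol.SENDER_INITIATED
```

For example, under these assumed thresholds, `select_protocol(32 * 1024, sender_arrived_first=True)` selects the hybrid protocol, while the same message with the receiver arriving first selects the receiver-initiated protocol.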
Year: 2009
DOI: 10.1145/1542275.1542320
Venue: I4CS
Keywords: large message, sender-initiated protocol, customized protocol, rendezvous protocol, receiver-initiated protocol, eager protocol, traditional sender-initiated rendezvous protocol, proposed technique, rdma-enabled cluster, proposed scheme, hybrid protocol, control message, maximizing mpi point-to-point communication, point to point, col, critical path, rdma, message passing interface, mpi
Field: InfiniBand, Computer science, Parallel computing, Communication source, Real-time computing, Message Passing Interface, Remote direct memory access, Rendezvous, Point-to-point, Critical path method, Distributed computing, Link Control Protocol
DocType: Conference
Citations: 5
PageRank: 0.47
References: 11
Authors: 2
Name           Order  Citations  PageRank
Matthew Small  1      12         2.06
Xin Yuan       2      1089       92.27