Title
Designing MPI Library with Dynamic Connected Transport (DCT) of InfiniBand: Early Experiences
Abstract
The Dynamic Connected (DC) InfiniBand transport protocol has recently been introduced by Mellanox to address several shortcomings of the older Reliable Connection (RC), eXtended Reliable Connection (XRC), and Unreliable Datagram (UD) transport protocols. DC aims to support all of the features provided by RC -- such as RDMA, atomics, and hardware reliability -- while allowing a process to communicate with any remote process using just one DC queue pair (QP), as with UD. In this paper we present the salient features of the new DC protocol, including its connection and communication models. We design new verbs-level collective benchmarks to study the behavior of the new DC transport and understand the performance/memory trade-offs it presents. We then use this knowledge to propose multiple designs for MPI over DC. We evaluate an implementation of our design in the MVAPICH2 MPI library using standard MPI benchmarks and applications. To the best of our knowledge, this is the first such design of an MPI library over the new DC transport. Our experimental results at the microbenchmark level show that the DC-based design in MVAPICH2 delivers 42% and 43% improvement in latency for large-message All-to-one exchanges over XRC and RC, respectively. DC-based designs also give 20% and 8% improvement for small-message One-to-all exchanges over RC and XRC, respectively. For the All-to-all communication pattern, DC delivers performance comparable to RC/XRC while outperforming them in memory consumption. At the application level, for NAMD on 620 processes, the DC-based designs in MVAPICH2 outperform designs based on RC, XRC, and UD by 22%, 10%, and 13%, respectively, in execution time. With DL-POLY, DC outperforms RC and XRC by 75% and 30%, respectively, in total completion time while delivering performance similar to UD.
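The memory side of the trade-off described in the abstract can be illustrated with a rough queue-pair count. The sketch below is not taken from the paper: the function name `qps_per_process`, its parameters, and the per-transport formulas are simplifying assumptions (RC is modeled as one QP per peer process, XRC as one per remote node, and UD and DC as a fixed number that does not grow with job size).

```python
def qps_per_process(transport: str, num_nodes: int, procs_per_node: int,
                    dc_initiators: int = 1) -> int:
    """Approximate send-side QPs one process needs to reach every other process.

    Hypothetical, simplified model for illustration only; real MPI libraries
    create connections lazily and use shared memory within a node.
    """
    total_procs = num_nodes * procs_per_node
    if transport == "RC":
        # Assumption: one connected QP per peer process.
        return total_procs - 1
    if transport == "XRC":
        # Assumption: one connection per remote node.
        return num_nodes - 1
    if transport == "UD":
        # A single datagram QP can address any peer.
        return 1
    if transport == "DC":
        # Assumption: a fixed pool of DC QPs, independent of job size.
        return dc_initiators
    raise ValueError(f"unknown transport: {transport}")


if __name__ == "__main__":
    # Hypothetical job layout: 4 nodes with 8 processes each.
    for name in ("RC", "XRC", "UD", "DC"):
        print(name, qps_per_process(name, num_nodes=4, procs_per_node=8))
```

Under these assumptions the RC count grows with the number of processes and the XRC count with the number of nodes, while the UD and DC counts stay constant, which is the scaling property the abstract attributes to DC.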
Year
2014
DOI
10.1007/978-3-319-07518-1_18
Venue
ISC
Keywords
high performance computing
Field
InfiniBand, Supercomputer, Latency (engineering), Computer science, Queue, Parallel computing, Discrete cosine transform, Real-time computing, Execution time, Remote direct memory access, Datagram
DocType
Conference
Citations
10
PageRank
0.64
References
13
Authors
5
Name                   Order  Citations  PageRank
Hari Subramoni         1      466        50.51
Khaled Hamidouche      2      180        19.45
Akshay Venkatesh       3      159        13.36
Sourav Chakraborty     4      381        49.27
Dhabaleswar K. Panda   5      5366       446.70