Title | ||
---|---|---|
Designing MPI Library with Dynamic Connected Transport (DCT) of InfiniBand: Early Experiences |
Abstract | ||
---|---|---|
The Dynamic Connected (DC) InfiniBand transport protocol has recently been introduced by Mellanox to address several shortcomings of the older Reliable Connection (RC), eXtended Reliable Connection (XRC), and Unreliable Datagram (UD) transport protocols. DC aims to support all of the features provided by RC -- such as RDMA, atomics, and hardware reliability -- while allowing processes to communicate with any remote process with just one DC queue pair (QP), like UD. In this paper we present the salient features of the new DC protocol, including its connection and communication models. We design new verbs-level collective benchmarks to study the behavior of the new DC transport and understand the performance/memory trade-offs it presents. We then use this knowledge to propose multiple designs for MPI over DC. We evaluate an implementation of our design in the MVAPICH2 MPI library using standard MPI benchmarks and applications. To the best of our knowledge, this is the first such design of an MPI library over the new DC transport. Our experimental results at the microbenchmark level show that the DC-based design in MVAPICH2 is able to deliver 42% and 43% improvement in latency for large message All-to-one exchanges over XRC and RC respectively. DC-based designs are also able to give 20% and 8% improvement for small message One-to-all exchanges over RC and XRC respectively. For the All-to-all communication pattern, DC is able to deliver performance comparable to RC/XRC while consuming less memory. At the application level, for NAMD on 620 processes, the DC-based designs in MVAPICH2 outperform designs based on RC, XRC, and UD by 22%, 10%, and 13% respectively in execution time. With DL-POLY, DC outperforms RC and XRC by 75% and 30%, respectively, in total completion time while delivering performance similar to UD. |
Year | DOI | Venue |
---|---|---|
2014 | 10.1007/978-3-319-07518-1_18 | ISC |
Keywords | Field | DocType |
high performance computing | InfiniBand,Supercomputer,Latency (engineering),Computer science,Queue,Parallel computing,Discrete cosine transform,Real-time computing,Execution time,Remote direct memory access,Datagram | Conference |
Citations | PageRank | References |
10 | 0.64 | 13 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hari Subramoni | 1 | 466 | 50.51 |
Khaled Hamidouche | 2 | 180 | 19.45 |
Akshay Venkatesh | 3 | 159 | 13.36 |
Sourav Chakraborty | 4 | 381 | 49.27 |
Dhabaleswar K. Panda | 5 | 5366 | 446.70 |