Title
A Feasibility Study for MPI over HDFS
Abstract
With the increasing prominence of integrating high-performance computing (HPC) with big-data (BIGDATA) processing, running MPI over the Hadoop Distributed File System (HDFS) offers a promising approach for delivering better scalability and fault tolerance to traditional HPC applications. However, it comes with challenges that discourage such an approach: (1) two-sided MPI communication to support intermediate data processing, (2) a focus on enabling N-1 writes that is subject to the default HDFS block-placement policy, and (3) a pipelined writing mode in HDFS that cannot fully utilize the underlying HPC hardware. So, while directly integrating MPI with HDFS may deliver better scalability and fault tolerance to MPI applications, it will fall short of delivering competitive performance. Consequently, we present a performance study to evaluate the feasibility of integrating MPI applications to run over HDFS. Specifically, we show that by aggregating and reordering intermediate data and coordinating computation and I/O when running MPI over HDFS, we can deliver up to 1.92x and 1.78x speedup over MPI I/O and HDFS pipelined-write implementations, respectively.
Year: 2020
DOI: 10.1109/HPEC43674.2020.9286250
Venue: 2020 IEEE High Performance Extreme Computing Conference (HPEC)
Keywords: fault tolerance, integrating MPI applications, 1.78x speedup, integrating high-performance computing, big-data processing, traditional HPC applications, MPI communication, intermediate data processing, default HDFS block-placement policy
DocType: Conference
ISSN: 2377-6943
ISBN: 978-1-7281-9220-8
Citations: 0
PageRank: 0.34
References: 12
Authors: 6
Name
Order
Citations
PageRank
Wu-chun Feng12812232.50
Zhangdui Zhong21577177.76
J. Zhang300.34
Hou-Kuei Huang401.01
S. Pumma500.34
H. Wang600.34