Abstract |
---|
The well-known gap between relative CPU speeds and storage bandwidth results in the need for new strategies for managing I/O demands. In large-scale MPI applications, collective I/O has long been an effective way to achieve higher I/O rates, but it poses two constraints. First, although overlapping collective I/O and computation represents the next logical step toward a faster time to solution, MPI's existing collective I/O API provides only limited support for doing so. Second, collective routines (both for I/O and for communication) impose a synchronization cost in addition to a communication cost. The upcoming MPI 3.1 standard will provide a new set of nonblocking collective I/O operations to satisfy the needs of applications. We present here initial work on the implementation of MPI nonblocking collective I/O operations in the MPICH MPI library. Our implementation begins with the extended two-phase algorithm used in ROMIO's collective I/O implementation. We then use a state machine and the extended generalized request interface to maintain the progress of nonblocking collective I/O operations. The evaluation results indicate that our implementation performs as well as blocking collective I/O in terms of I/O bandwidth and is capable of overlapping I/O with other operations. We believe that our implementation can help users try nonblocking collective I/O operations in their applications. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/CCGrid.2015.81 | Cluster Computing and the Grid |
Field | DocType | ISSN |
---|---|---|
Synchronization, MPICH, Computer science, Parallel computing, Finite-state machine, Input/output, Bandwidth (signal processing), Benchmark (computing), Computation, Distributed computing, Cloud computing | Conference | 2376-4414 |
Citations | PageRank | References |
---|---|---|
1 | 0.37 | 12 |
Authors |
---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sangmin Seo | 1 | 326 | 20.05 |
Robert Latham | 2 | 134 | 8.57 |
Junchao Zhang | 3 | 133 | 13.02 |
Pavan Balaji | 4 | 1475 | 111.48 |