Title
Efficient Structured Data Access in Parallel File Systems.
Abstract
Parallel scientific applications store and retrieve very large, structured datasets. Directly supporting these structured accesses is an important step in providing high-performance I/O solutions for these applications. High-level interfaces such as HDF5 and Parallel netCDF provide convenient APIs for accessing structured datasets, and the MPI-IO interface also supports efficient access to structured data. However, parallel file systems do not traditionally support such access. In this work we present an implementation of structured data access support in the context of the Parallel Virtual File System (PVFS). We call this support "datatype I/O" because of its similarity to MPI datatypes. This support is built by using a reusable datatype-processing component from the MPICH2 MPI implementation. We describe how this component is leveraged to efficiently process structured data representations resulting from MPI-IO operations. We quantitatively assess the solution using three test applications. We also point to further optimizations in the processing path that could be leveraged for even more efficient operation.
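To illustrate what a structured access description looks like, the following minimal sketch contrasts a compact, vector-style description with its flattened list of (offset, length) regions. It is not the PVFS or MPICH2 implementation: `VectorType` and `flatten` are hypothetical names introduced here, and the fields are expressed in bytes as a simplifying assumption (MPI's own vector datatype counts in elements of a base type).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class VectorType:
    """Hypothetical stand-in for an MPI-style vector datatype (units: bytes)."""
    count: int        # number of blocks
    blocklength: int  # contiguous bytes in each block
    stride: int       # bytes between the starts of consecutive blocks


def flatten(vtype: VectorType, base_offset: int = 0) -> List[Tuple[int, int]]:
    """Expand the compact description into explicit (offset, length) pairs."""
    return [(base_offset + i * vtype.stride, vtype.blocklength)
            for i in range(vtype.count)]


# Two adjacent columns of a 4 x 8 row-major byte array, starting at column 2:
# 4 blocks of 2 bytes each, with block starts 8 bytes apart.
column_block = VectorType(count=4, blocklength=2, stride=8)
regions = flatten(column_block, base_offset=2)
print(regions)
```

The compact form stays at three integers however many rows are accessed, while the flattened list grows by one entry per block. That size difference is the general motivation for passing a structured description along the I/O path rather than a region list.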
Year
2003
DOI
10.1109/CLUSTR.2003.1253331
Venue
CLUSTER
Keywords
data structures, information retrieval, distributed databases, efficiency, parallel processing, message passing, structured data
Field
Hierarchical Data Format, Data structure, Virtual file system, Computer science, Parallel processing, Parallel computing, NetCDF, Distributed database, Data model, Message passing, Distributed computing
DocType
Conference
ISBN
0-7695-2066-9
Citations
41
PageRank
2.71
References
12
Authors
5
Name | Order | Citations and PageRank
Avery Ching | 1 | 22116.21
Alok N. Choudhary | 2 | 24222.44
Wei-keng Liao | 3 | 109587.98
Robert Ross | 4 | 2717173.13
William D. Gropp | 5 | 5547548.31