Title
DTF: An I/O Arbitration Framework for Multi-Component Data Processing Workflows
Abstract
Multi-component workflows, in which each component applies a particular transformation to the data and passes the result on to the next component, are a common way of performing complex computations. Using components as building blocks, we can apply sophisticated data processing algorithms to large volumes of data. Because the components may be developed independently, they often use file I/O and the parallel file system to pass data. However, as the data volume increases, file I/O quickly becomes the bottleneck in such workflows. In this work, we propose an I/O arbitration framework called DTF that alleviates this problem by silently replacing file I/O with direct data transfer between the components. DTF treats file I/O calls as I/O requests and performs I/O request matching to move the data. Currently, the framework works with PnetCDF-based multi-component workflows. It requires minimal modifications to applications and allows the user to easily control I/O flow via the framework's configuration file.
Year
2018
DOI
10.1007/978-3-319-92040-5_4
Venue
High Performance Computing, ISC High Performance 2018
Keywords
Multi-component workflow, Workflow coupling, I/O performance, I/O arbitration
Field
Bottleneck, File system, Data processing, Data transmission, Computer science, Parallel computing, Input/output, Arbitration, Workflow, Computation
DocType
Conference
Volume
10876
ISSN
0302-9743
Citations
1
PageRank
0.36
References
13
Authors
9
Name
Order
Citations
PageRank
Tatiana V. Martsinkevich1312.25
Balazs Gerofi210716.24
Guo-Yuan Lien310.70
Seiya Nishizawa4174.15
Wei-keng Liao5109587.98
Takemasa Miyoshi6203.62
Hirofumi Tomita7335.67
Yutaka Ishikawa81449188.06
Alok N. Choudhary93441326.32