Title
HARL: Optimizing Parallel File Systems with Heterogeneity-Aware Region-Level Data Layout.
Abstract
Parallel file systems (PFSs) are commonly used in high-end computing systems. With the emergence of solid state drives (SSDs), hybrid PFSs, which consist of both HDD and SSD servers, provide a practical I/O system solution for data-intensive applications. However, most existing data layout schemes are inefficient for hybrid PFSs because they are unaware of server heterogeneity and of workload changes across different parts of a file. In this study, we propose a heterogeneity-aware region-level data layout scheme, HARL, to improve the data distribution of a hybrid PFS. HARL first divides a file into fine-grained, variable-sized regions according to the workload features of an application, then determines an appropriate file stripe size on each server for each region based on the performance of the heterogeneous servers. To further improve the performance of a hybrid PFS, we also propose a dynamic region-level layout scheme, HARL-D, which creates multiple replicas of each region and redirects file requests at runtime to the replicas with the lowest access costs. Experimental results from representative benchmarks and a real application show that HARL can greatly improve I/O system performance, and demonstrate the advantages of HARL-D over HARL.
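The two ideas in the abstract (per-server stripe sizes that follow server performance, and redirecting a request to the cheapest replica) can be illustrated with a minimal sketch. This is not the paper's cost model: the function names, the bandwidth figures, the 4 KiB rounding unit, and the assumption that transfer time is simply bytes divided by bandwidth are all hypothetical and chosen only for illustration.

```python
from typing import Dict


def proportional_stripes(round_bytes: int,
                         bandwidth_mb_s: Dict[str, float],
                         unit: int = 4096) -> Dict[str, int]:
    """Split one striping round across servers in proportion to bandwidth.

    Assumed (hypothetical) model: time = bytes / bandwidth, so giving each
    server a share proportional to its bandwidth lets all servers finish
    at roughly the same time. Shares are rounded down to a multiple of
    `unit`, with a floor of one unit per server.
    """
    total_bw = sum(bandwidth_mb_s.values())
    stripes = {}
    for server, bw in bandwidth_mb_s.items():
        share = round_bytes * bw / total_bw
        stripes[server] = max(unit, int(share // unit) * unit)
    return stripes


def cheapest_replica(costs: Dict[str, float]) -> str:
    """Return the name of the replica with the lowest estimated access cost."""
    return min(costs, key=costs.get)


# Example: two slower HDD servers and one faster SSD server (made-up numbers).
layout = proportional_stripes(
    204800, {"hdd0": 100.0, "hdd1": 100.0, "ssd0": 300.0}
)
choice = cheapest_replica({"replica_a": 3.2, "replica_b": 1.7})
```

In this sketch the SSD server receives a stripe three times larger than each HDD server, which is the intuition behind a heterogeneity-aware layout; a real scheme would derive the costs from measured server and workload parameters for each region.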
Year
2017
DOI
10.1109/TC.2016.2637905
Venue
IEEE Trans. Computers
Keywords
Layout, Servers, File systems, Solids, Benchmark testing, Computers, Electronic mail
Field
File system, Data layout, Workload, Computer science, Server, Parallel computing, Real-time computing, Solid-state drive, Solid-state, Computing systems, Operating system, Benchmark (computing)
DocType
Journal
Volume
66
Issue
6
ISSN
0018-9340
Citations
0
PageRank
0.34
References
29
Authors
4
Name          Order  Citations  PageRank
Shuibing He   1      109        20.45
Yang Wang     2      188        45.73
Xian-he Sun   3      1987       182.64
Z. Chen       4      3443       271.62