Title
Metadata Distribution and Consistency Techniques for Large-Scale Cluster File Systems
Abstract
Most supercomputers today are built on large clusters, which call for sophisticated, scalable, and decentralized metadata processing techniques. To maximize metadata throughput, an ideal metadata distribution policy should automatically balance namespace locality against even load distribution, without manual intervention. None of the existing metadata distribution schemes is designed to strike such a balance. We propose a novel metadata distribution policy, Dynamic Dir-Grain (DDG), which balances namespace locality against even load distribution by dynamically partitioning the namespace into size-adjustable hierarchical units. Extensive simulation and measurement results show that DDG policies with a proper granularity significantly outperform traditional techniques such as the Random policy and the Subtree policy by 40 percent to 62 times. From the perspective of file system reliability, metadata consistency is an equally important issue, but it is complicated by dynamic metadata distribution. The consistency of operations that span multiple metadata servers cannot be ensured by traditional metadata journaling on each server. While the traditional two-phase commit (2PC) algorithm can be used, it is too costly for distributed file systems. We propose a consistent metadata processing protocol, S2PC-MP, which combines the two-phase commit algorithm with metadata processing to reduce overheads. Our measurement results show that S2PC-MP not only ensures fast recovery, but also greatly reduces fail-free execution overheads.
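The trade-off between namespace locality and even distribution described in the abstract can be illustrated with a minimal sketch. It is not the DDG algorithm itself: the abstract does not specify how DDG sizes or adjusts its units, so the sketch assumes a fixed-depth directory grain as a simplified stand-in. All function names (`server_for`, `random_policy`, `subtree_policy`, `dir_grain_policy`) and the `grain_depth` parameter are hypothetical, as is the interpretation of the Random and Subtree policies as hashing the full path and the top-level directory, respectively.

```python
import hashlib


def server_for(key, n_servers):
    """Map a string key to a server index using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


def components(path):
    """Split a path into its non-empty components."""
    return [part for part in path.split("/") if part]


def random_policy(path, n_servers):
    # Assumption: "Random" hashes the full path, which spreads load evenly
    # but scatters the entries of one directory across servers.
    return server_for(path, n_servers)


def subtree_policy(path, n_servers):
    # Assumption: "Subtree" pins each top-level directory to one server,
    # which keeps locality but can leave the load skewed.
    parts = components(path)
    top = parts[0] if parts else ""
    return server_for("/" + top, n_servers)


def dir_grain_policy(path, n_servers, grain_depth):
    # Simplified stand-in for a directory grain: the file's parent
    # directories, truncated to `grain_depth` components, form the unit
    # that is placed on a server. The real DDG units are size-adjustable
    # and dynamic; a fixed depth is used here only for illustration.
    parent_dirs = components(path)[:-1]
    grain = "/" + "/".join(parent_dirs[:grain_depth])
    return server_for(grain, n_servers)
```

Under these assumptions, a small `grain_depth` behaves like the Subtree policy (coarse units, strong locality), while a larger `grain_depth` produces more, smaller units that spread across servers more evenly, yet files in the same directory still land on the same server.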
Year: 2011
DOI: 10.1109/TPDS.2010.154
Venue: IEEE Trans. Parallel Distrib. Syst.
Keywords: metadata consistency techniques, random policy, protocols, decentralized metadata processing technique, distributed file systems, subtree policy, traditional metadata, metadata distribution scheme, dynamic metadata distribution, namespace locality, ideal metadata distribution policy, dynamic dir-grain, metadata throughput, large-scale cluster file systems, large-scale cluster distributed file system reliability, metadata distribution, metadata processing, file organisation, s2pc-mp, metadata distribution technique, metadata consistency, meta data, consistent metadata processing protocol, metadata management, size-adjustable hierarchical units, distributed processing, novel metadata distribution policy, consistency techniques, two-phase commit algorithm, servers, indexing terms, reliability, heuristic algorithm, decision support systems, decision support system, distributed file system
DocType: Journal
Volume: 22
Issue: 5
ISSN: 1045-9219
Citations: 19
PageRank: 0.72
References: 24
Authors: 5
Name            Order  Citations  PageRank
Jin Xiong       1      157        15.95
Yiming Hu       2      639        44.91
Guojie Li       3      402        64.67
Rongfeng Tang   4      23         3.35
Zhihua Fan      5      25         1.59