Title
IndexFS: scaling file system metadata performance with stateless caching and bulk insertion
Abstract
Modern storage systems are expected to grow beyond billions of objects, making metadata scalability critical to overall performance. Many existing distributed file systems focus only on providing highly parallel, fast access to file data and lack a scalable metadata service. In this paper, we introduce a middleware design called IndexFS that adds support to existing file systems such as PVFS, Lustre, and HDFS for scalable high-performance operations on metadata and small files. IndexFS uses a table-based architecture that incrementally partitions the namespace on a per-directory basis, preserving server and disk locality for small directories. An optimized log-structured layout is used to store metadata and small files efficiently. We also propose two client-based storm-free caching techniques: bulk namespace insertion for creation-intensive workloads such as N-N checkpointing, and stateless consistent metadata caching for hot spot mitigation. By combining these techniques, we have demonstrated that IndexFS scales to 128 metadata servers. Experiments show that our out-of-core metadata throughput outperforms existing solutions such as PVFS, Lustre, and HDFS by 50% to two orders of magnitude.
Year
2014
DOI
10.1109/SC.2014.25
Venue
SC
Keywords
middleware design, file system metadata performance scaling, high-performance operations, disk locality, lustre, checkpointing, stateless consistent metadata caching, metadata scalability, cache storage, out-of-core metadata throughput, log-structured merge tree, file system metadata, namespace partitioning, pvfs, bulk insertion, client-based storm free caching techniques, preserving server, log-structured layout optimization, indexfs, stateless caching, middleware, hot spot mitigation, meta data, storage systems, n-n check pointing, creation intensive workloads, bulk namespace insertion, table-based architecture, distributed file systems, per-directory basis, hdfs, log structured merge tree
Field
Metadata, File system, Global Namespace, Computer science, Parallel computing, Server, Log-structured merge-tree, Journaling file system, Namespace, Operating system, Storage Resource Broker, Distributed computing
DocType
Conference
ISSN
2167-4329
Citations
40
PageRank
1.08
References
34
Authors
4
Name | Order | Citations | PageRank
Kai Ren | 1 | 229 | 12.85
Qing Zheng | 2 | 91 | 5.40
Swapnil Patil | 3 | 306 | 18.05
Garth A. Gibson | 4 | 2517 | 250.27