Title
DeepHash: An End-to-End Learning Approach for Metadata Management in Distributed File Systems
Abstract
In distributed file systems, distributed metadata management can be viewed as a mapping problem: how to effectively map the metadata namespace tree onto multiple metadata servers (MDSs). Traditional distributed metadata management schemes generally presume a rigid mapping function and therefore fail to adapt to the requirements of different applications. To take better advantage of the current distribution of the metadata, this exploratory paper presents the first machine-learning-based model, called DeepHash, which leverages a deep neural network to learn a locality preserving hashing (LPH) mapping. To help the model learn the positional relationships of metadata nodes in the namespace tree, we first present a metadata representation strategy. Because training labels (i.e., the hash values of metadata nodes) are absent, we design two loss functions with distinct characteristics, a pair loss and a triplet loss, each used to train DeepHash separately, and introduce sampling strategies for both approaches. We conduct extensive experiments on the Amazon EC2 platform to compare the performance of DeepHash with traditional and state-of-the-art schemes. The results show that DeepHash preserves metadata locality well while maintaining good load balance, which demonstrates its effectiveness and efficiency.
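The abstract names a triplet loss for learning a locality preserving hash but does not give its form. As a rough illustration only, the sketch below shows a generic margin-based triplet loss over scalar hash values, together with a simple path-based tree distance that could be used to decide which metadata nodes count as "near" or "far". The function names, the use of absolute differences, the margin value, and the tree-distance definition are all assumptions made for illustration; they are not taken from the paper.

```python
def tree_distance(path_a: str, path_b: str) -> int:
    """Edges between two namespace-tree nodes given as absolute paths.

    Assumed definition (not from the paper): steps up from each node
    to their lowest common ancestor.
    """
    a = [part for part in path_a.split("/") if part]
    b = [part for part in path_b.split("/") if part]
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


def triplet_loss(h_anchor: float, h_pos: float, h_neg: float,
                 margin: float = 0.1) -> float:
    """Generic margin-based triplet loss on scalar hash values.

    The loss is zero once the anchor is closer to the positive (a node
    near it in the tree) than to the negative by at least `margin`.
    """
    return max(0.0, abs(h_anchor - h_pos) - abs(h_anchor - h_neg) + margin)


# Hypothetical paths: siblings are "near", a node in another subtree is "far".
anchor, near, far = "/home/a/x.txt", "/home/a/y.txt", "/var/log/z"
near_is_closer = tree_distance(anchor, near) < tree_distance(anchor, far)

# A mapping that keeps siblings close incurs no loss; the reverse is penalized.
good = triplet_loss(0.20, 0.25, 0.90)
bad = triplet_loss(0.20, 0.90, 0.25)
```

Under these assumptions, a mapping that places sibling files at nearby hash values yields zero loss for the triplet, while a mapping that separates them is penalized in proportion to the violation.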
Year
2019
DOI
10.1145/3337821.3337924
Keywords
distributed file system, locality preserving hashing, metadata management, neural network
Field
End-to-end principle, Computer science, Multimedia, Metadata management, Distributed computing
DocType
Conference
ISSN
978-1-4503-6295-5
ISBN
978-1-4503-6295-5
Citations
1
PageRank
0.34
References
0
Authors
3
Name           Order  Citations  PageRank
Yuanning Gao   1      4          2.73
Xiaofeng Gao   2      713        98.58
Guihai Chen    3      3537       317.28