Title
Research of Massive Small Files Reading Optimization Based on Parallel Network File System
Abstract
With the rapid development of cloud computing and big data, the number of small files keeps growing. Managing massive numbers of small files efficiently while providing low-latency service is becoming a hot topic for the Parallel Network File System (pNFS). When pNFS reads massive numbers of small files, metadata is accessed very frequently and disk efficiency is low, so small-file access performs far worse than large-file access. This paper presents an optimization mechanism for reading small files that comprises extended readdir delegation, radical metadata pre-read, and large-I/O data pre-read across small files. These optimizations can significantly reduce read latency and make full use of the client cache. Extensive experiments demonstrate the effectiveness of the mechanism: when reading massive numbers of small files, compared with pNFS, metadata read performance is 1959% higher, sequential data read performance is 2436% higher, random data read performance is 1675% higher, and overall performance is 1767% higher.
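The abstract does not describe how the metadata pre-read is implemented. As a rough illustration of why fetching metadata in bulk can help, the toy model below counts client-to-server round trips for listing and stat-ing every file in a directory, with and without a batched call. All names here (`ToyMetadataServer`, `readdir_with_attrs`, `stat_all_naive`, `stat_all_preread`) are hypothetical and are not taken from the paper or from any pNFS implementation; this is a sketch under those assumptions only.

```python
from dataclasses import dataclass


@dataclass
class ToyMetadataServer:
    """Hypothetical stand-in for a metadata server that counts round trips."""
    sizes: dict            # file name -> size in bytes
    round_trips: int = 0

    def readdir(self):
        # One request returns only the file names.
        self.round_trips += 1
        return sorted(self.sizes)

    def get_attr(self, name):
        # One request per file returns its attributes (here: just the size).
        self.round_trips += 1
        return self.sizes[name]

    def readdir_with_attrs(self):
        # Assumed batched call: names and attributes come back in one reply.
        self.round_trips += 1
        return dict(self.sizes)


def stat_all_naive(server):
    """List the directory, then fetch each file's attributes separately."""
    return {name: server.get_attr(name) for name in server.readdir()}


def stat_all_preread(server):
    """Fetch names and attributes together and serve lookups from a client cache."""
    cache = server.readdir_with_attrs()
    return {name: cache[name] for name in cache}


if __name__ == "__main__":
    files = {f"f{i:04d}": 4096 for i in range(1000)}
    naive, preread = ToyMetadataServer(files), ToyMetadataServer(files)
    assert stat_all_naive(naive) == stat_all_preread(preread)
    print("naive round trips:   ", naive.round_trips)
    print("pre-read round trips:", preread.round_trips)
```

The model ignores reply size, cache invalidation, and delegation recall, all of which a real client would have to handle; it only shows that the number of requests stops growing with the number of files once attributes arrive in bulk.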
Year: 2015
DOI: 10.1109/HPCC-CSS-ICESS.2015.97
Venue: HPCC/CSS/ICESS
Keywords: Small files, pre-read, pNFS, read optimization
Field: Metadata, Cache, Computer science, Server, Throughput, Data file, Big data, Operating system, Database, Network File System, Cloud computing
DocType: Conference
ISSN: 2576-3504
Citations: 0
PageRank: 0.34
References: 7
Authors: 5
Name             Order  Citations  PageRank
Yang Hongzhang   1      0          0.34
Junwei Zhang     2      46         9.28
Xiangchao Zeng   3      0          0.34
Huanqing Dong    4      4          1.77
Lu Xu            5      16         6.54