Title
An Experimental Study on Data Recovery Performance Improvement for HDFS with NVM
Abstract
Non-Volatile Memory (NVM) is a promising device for storing data and accelerating big data analysis due to its excellent I/O performance. However, we find that simply replacing Hard Disk Drives (HDDs) with NVM does not bring the expected performance improvement. In this paper, we take data recovery in the Hadoop Distributed File System (HDFS) as a case study to investigate how to take advantage of NVM's performance. We analyze the data recovery mechanism in HDFS and find that the configuration of replication tasks in the DataNode significantly affects data recovery. We conduct extensive analysis and experiments to tune the configuration and report some interesting findings. With the new configuration, the data recovery performance improvement increases from 17% to 71%. The optimized configuration also improves the execution performance of MapReduce tasks by 28% to 59%.
Year: 2020
DOI: 10.1109/ICCCN49398.2020.9209698
Venue: 2020 29th International Conference on Computer Communications and Networks (ICCCN)
Keywords: Data Recovery, HDFS, Non-Volatile Memory, Performance Tuning
DocType: Conference
ISSN: 1095-2055
ISBN: 978-1-7281-6607-0
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name, Order, Citations, PageRank
Huijie Li, 1, 0, 0.34
Xin Li, 2, 0, 0.34
Youyou Lu, 3, 356, 30.81
Xiaolin Qin, 4, 175, 41.82