Abstract | ||
---|---|---|
Non-Volatile Memory (NVM) is a promising device for storing data and accelerating big data analysis due to its excellent I/O performance. However, we find that simply replacing a Hard Disk Drive (HDD) with NVM does not bring the expected performance improvement. In this paper, we take the data recovery issue in the Hadoop Distributed File System (HDFS) as a case study to investigate how to take advantage of the performance of NVM. We analyze the data recovery mechanism in HDFS and find that the configuration of replication tasks in the DataNode can significantly affect data recovery. We conduct extensive analysis and experiments to tune the configuration and report some interesting findings. With the new configuration, we increase the data recovery performance improvement from 17% to 71%. At the same time, the optimized configuration also improves the execution performance of MapReduce tasks by 28% to 59%. |
Year | DOI | Venue |
---|---|---|
2020 | 10.1109/ICCCN49398.2020.9209698 | 2020 29th International Conference on Computer Communications and Networks (ICCCN) |
Keywords | DocType | ISSN |
Data Recovery,HDFS,Non-Volatile Memory,Performance Tuning | Conference | 1095-2055 |
ISBN | Citations | PageRank |
978-1-7281-6607-0 | 0 | 0.34 |
References | Authors | |
0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Huijie Li | 1 | 0 | 0.34 |
Xin Li | 2 | 0 | 0.34 |
Youyou Lu | 3 | 356 | 30.81 |
Xiaolin Qin | 4 | 175 | 41.82 |