Title
Improving Restore Performance of Deduplication Systems by Leveraging the Chunk Sequence in Backup Stream.
Abstract
Traditional deduplication-based backup systems normally employ containers to reduce chunk fragmentation and thus improve restore performance. However, the number of shared chunks belonging to a single backup grows as the number of backups increases, and those shared chunks are normally distributed across multiple containers. This increases chunk fragmentation and significantly degrades restore performance. To improve restore performance, some schemes optimize the replacement strategy of the restore cache, such as those using LRU and OPT. However, LRU is inefficient and OPT incurs additional computational overhead. By analyzing the backup and restore processes, we observe that the sequence of chunks in the backup stream is consistent with that in the restore stream. Based on this observation, this paper proposes OFL, an off-line optimal replacement strategy for the restore cache. OFL records the chunk sequence of the backup process and then uses this sequence to calculate, in advance, the exact information about the chunks required by the restore process. Finally, accurate prefetching leverages this information to reduce the impact of chunk fragmentation. Real data sets are used to evaluate the proposed OFL. The experimental results demonstrate that OFL improves restore performance by more than 8% compared with the traditional LRU and OPT.
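The idea of driving cache replacement from a recorded sequence can be illustrated with a small sketch. This is not the paper's OFL implementation; it is a minimal illustration that assumes the recorded chunk sequence has already been mapped to a sequence of container IDs, and that eviction follows a farthest-next-use rule (a common offline-optimal policy). The function names, the container-ID representation, and the cache size measured in containers are all assumptions made for the example.

```python
from collections import OrderedDict, defaultdict, deque


def restore_with_lru(container_seq, cache_size):
    """Count container reads when the cache evicts the least recently used container."""
    cache = OrderedDict()
    reads = 0
    for cid in container_seq:
        if cid in cache:
            cache.move_to_end(cid)  # mark as most recently used
            continue
        reads += 1  # cache miss: the container must be read from disk
        if len(cache) >= cache_size:
            cache.popitem(last=False)  # evict the least recently used entry
        cache[cid] = True
    return reads


def restore_with_recorded_sequence(container_seq, cache_size):
    """Count container reads when eviction uses the sequence recorded at backup time.

    Hypothetical policy for illustration: evict the cached container whose
    next use lies farthest in the future (or that is never used again).
    """
    # Precompute, for every container, the positions at which it is needed.
    positions = defaultdict(deque)
    for i, cid in enumerate(container_seq):
        positions[cid].append(i)

    cache = set()
    reads = 0
    for cid in container_seq:
        positions[cid].popleft()  # the current access is no longer a future use
        if cid in cache:
            continue
        reads += 1
        if len(cache) >= cache_size:
            victim = max(
                cache,
                key=lambda c: positions[c][0] if positions[c] else float("inf"),
            )
            cache.remove(victim)
        cache.add(cid)
    return reads
```

For the access sequence A, B, C, A, B, C with room for two containers, the LRU variant reads a container on every access, while the sequence-aware variant avoids two of those reads because it knows which cached container is needed soonest.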
Year: 2018
Venue: ICA3PP
Field: Data deduplication, Overhead (computing), Computer science, Cache, Instruction prefetch, Backup, Distributed computing
DocType: Conference
Citations: 0
PageRank: 0.34
References: 20
Authors: 4
Name        Order  Citations  PageRank
Ru Yang     1      5          2.41
Yuhui Deng  2      331        39.56
Cheng Hu    3      16         4.65
Lei Si      4      0          0.68