Title
PM-RAD: An Efficient Restore Algorithm in Deduplication by Pattern Matching
Abstract
Deduplication is one of the most effective and efficient techniques for saving storage space, and it is widely used in data centers and cloud storage systems. After duplicate chunks are identified and removed, logically consecutive chunks may be physically scattered across different containers, which causes a serious fragmentation problem. Fragmentation inevitably degrades restore performance severely. In this paper, we propose PM-RAD, an efficient restore algorithm that uses pattern matching to boost restore performance. It tries to reduce the number of container reads by finding read patterns within a look-ahead window. It can also merge scattered chunks and read them at once, which reduces the number of disk accesses. Moreover, we optimize the proposed algorithm in two respects, separate caches and cyclic pattern matching, to further reduce disk accesses. During pattern matching, we split the cache into a metadata cache, responsible for fingerprints, and a data cache, which stores chunks. Cyclic pattern matching makes it possible to find much longer patterns in a continuously sliding window. We implement the proposed algorithm and evaluate it experimentally on various data sets. Experimental results show that our algorithm outperforms the state-of-the-art work in terms of restore performance.
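The effect of fragmentation on container reads can be illustrated with a small sketch. This is not the PM-RAD algorithm itself, whose details the abstract does not give; it only compares a naive restore with a simple look-ahead grouping. The recipe layout, the function names, and the fixed non-overlapping window are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical restore recipe: (chunk fingerprint, container id) in logical order.
Recipe = List[Tuple[str, int]]


def naive_container_reads(recipe: Recipe) -> int:
    """Count container reads when only the most recently read container is kept."""
    reads = 0
    current = None
    for _fingerprint, container_id in recipe:
        if container_id != current:
            reads += 1
            current = container_id
    return reads


def windowed_container_reads(recipe: Recipe, window: int) -> int:
    """Count container reads when each container is read once per look-ahead window.

    Assumption: windows are fixed-size and non-overlapping, and nothing is
    cached across windows. A real restore cache would do better than this.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    reads = 0
    for start in range(0, len(recipe), window):
        # All chunks in the window that share a container are served by one read.
        containers = {cid for _fp, cid in recipe[start:start + window]}
        reads += len(containers)
    return reads


# A fragmented recipe: consecutive chunks alternate between three containers.
recipe = [("a", 1), ("b", 2), ("c", 1), ("d", 3),
          ("e", 2), ("f", 1), ("g", 3), ("h", 2)]

print(naive_container_reads(recipe))        # one read per container switch
print(windowed_container_reads(recipe, 4))  # reads grouped per 4-chunk window
print(windowed_container_reads(recipe, 8))  # whole recipe in one window
```

In this toy recipe, a wider window lets more scattered chunks share a single container read, which is the intuition behind looking ahead in the restore sequence.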
Year: 2018
DOI: 10.1109/HPCC/SmartCity/DSS.2018.00078
Venue: 2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)
Keywords: Deduplication, Restore Algorithm, Pattern Matching
Field: Data deduplication, Metadata, Data set, Sliding window protocol, Cache, Computer science, Algorithm, Fragmentation (computing), Pattern matching, Cloud storage
DocType: Conference
ISBN: 978-1-5386-6615-9
Citations: 0
PageRank: 0.34
References: 4
Authors: 6
Name            Order  Citations  PageRank
Guangping Xu    1      48         13.96
Zhang Yi        2      1765       194.41
Sheng Lin       3      26         3.10
Kai Shi         4      8          4.99
Quan Yu         5      241        23.72
Chi Wan Sung    6      779        91.41