Title
DAM: A DataOwnership-Aware Multi-layered De-duplication Scheme
Abstract
Beyond the storage savings that chunk-level de-duplication brings to backup and archiving systems, a prominent challenge for the technology is identifying duplicate chunks efficiently and effectively. Because main memory capacity is limited, most of the chunk fingerprints used to identify individual chunks are stored on disk. Checking for a fingerprint match on disk for every input chunk is known to be a severe performance bottleneck in the backup process. However, our intuition and our analyses of real backup data both indicate that duplicate chunks tend to concentrate strongly according to data ownership. Motivated by this observation, and to avoid or alleviate this bottleneck, we propose DAM, a data-ownership-aware multi-layered de-duplication scheme that exploits the ownership of data chunks and uses a tri-layered de-duplication approach to narrow the search space for duplicate chunks and thereby reduce total disk accesses. Our experimental results with real-world datasets show that DAM reduces disk accesses by an average of 60.8% and shortens de-duplication time by an average of 46.3%.
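The idea of narrowing the fingerprint search space by ownership can be illustrated with a minimal Python sketch. This is not the DAM design: the abstract does not spell out what the three layers are, so the sketch uses a simplified two-tier lookup. The class name `OwnershipAwareIndex`, the use of SHA-1 as the fingerprint, and the `global_lookups` counter (a stand-in for disk accesses) are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def fingerprint(data: bytes) -> str:
    """Chunk fingerprint; SHA-1 is an assumption, not taken from the paper."""
    return hashlib.sha1(data).hexdigest()


class OwnershipAwareIndex:
    """Hypothetical two-tier lookup: per-owner index first, global index second."""

    def __init__(self):
        # owner -> fingerprints seen for that owner (stands in for a small in-memory index)
        self.owner_index = defaultdict(set)
        # every fingerprint seen so far (stands in for the on-disk index)
        self.global_index = set()
        # number of times the global index had to be consulted (proxy for disk accesses)
        self.global_lookups = 0

    def is_duplicate(self, owner: str, chunk: bytes) -> bool:
        fp = fingerprint(chunk)
        # Tier 1: duplicates tend to cluster by owner, so check the owner's index first.
        if fp in self.owner_index[owner]:
            return True
        # Tier 2: fall back to the global index, which models the expensive lookup.
        self.global_lookups += 1
        duplicate = fp in self.global_index
        self.global_index.add(fp)
        self.owner_index[owner].add(fp)
        return duplicate


def demo():
    index = OwnershipAwareIndex()
    stream = [("alice", b"A"), ("alice", b"B"), ("alice", b"A"),
              ("bob", b"A"), ("bob", b"A")]
    results = [index.is_duplicate(owner, chunk) for owner, chunk in stream]
    return results, index.global_lookups
```

In `demo()`, five chunks are checked but the global index is consulted only three times, because the two repeats within the same owner are resolved by the per-owner index. A real system would also need chunking, persistence, and a bound on the size of the per-owner indexes, none of which are modeled here.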
Year
2010
DOI
10.1109/NAS.2010.57
Venue
NAS
Keywords
dam, backup, de-duplication time, dataownership-aware multi-layered de-duplication scheme, disk accesses, storage management, real backup data, chunk-level deduplication, chunk fingerprint match, chunk fingerprint, data compression, data ownership-aware multilayered deduplication scheme, backup process, duplicate chunk, chunk fingerprints, tri-layered de-duplication approach, archiving systems, de-duplication, disk access, aforementioned backup performance bottleneck, main memory capacity, chunk-level de-duplication, servers, search space, data structures, redundancy, noise measurement, de duplication
Field
Data deduplication, Bottleneck, Data structure, Computer science, Server, Computer network, Exploit, Real-time computing, Redundancy (engineering), Data compression, Backup
DocType
Conference
ISBN
978-1-4244-8133-0
Citations
0
PageRank
0.34
References
0
Authors
4
Name          Order  Citations  PageRank
Yujuan Tan    1      13823.48
Dan Feng      2      1845188.16
Zhichao Yan   3      8210.60
Guohui Zhou   4      7329.90