Title
EaD: a Collision-free and High Performance Deduplication Scheme for Flash Storage Systems
Abstract
Inline deduplication is a popular technique for reducing write traffic and improving space efficiency in flash-based storage. However, it also introduces computing and memory overhead to generate and store the cryptographic hash (fingerprint) of each data chunk. With the advent of 3D XPoint and Z-NAND technologies, which offer vastly improved latency and bandwidth, both overheads are becoming much more pronounced in deduplication-based flash storage that uses cryptographic hash functions. To address these problems, we propose an ECC (Error Correcting Code) assisted deduplication approach, called EaD, which exploits the ECC property and the asymmetric read-write performance characteristics of modern flash-based storage. EaD first identifies similar data by using the ECC values of data chunks as their fingerprints, thus significantly reducing costly cryptographic hash computation and alleviating the memory space overhead. Based on the identification results, similar data chunks and their ECCs are read from the flash and compared byte by byte in memory to definitively identify and remove redundant data chunks. Our experiments show that EaD reduces I/O latency by an average of 1.92× and 1.86×, and reduces memory consumption by an average of 35.0% and 21.9%, compared with existing SHA-based and sampling-based deduplication approaches, respectively.
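The two-step flow described in the abstract (a cheap fingerprint lookup followed by a byte-by-byte check) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `EccAssistedStore`, the helper `ecc_stand_in`, and the 4 KiB chunk size are hypothetical, and CRC-32 is used only as a placeholder for the ECC value that a real flash device would already compute.

```python
import zlib

CHUNK_SIZE = 4096  # assumed chunk size; the abstract does not state one


def ecc_stand_in(chunk: bytes) -> int:
    """Hypothetical placeholder for the per-chunk ECC value.

    CRC-32 is NOT an error-correcting code; it only plays the role of a
    cheap, already-available fingerprint in this sketch.
    """
    return zlib.crc32(chunk)


class EccAssistedStore:
    """Toy in-memory model: fingerprint lookup, then byte-by-byte check."""

    def __init__(self) -> None:
        self.flash = []   # simulated flash: list of stored chunks
        self.index = {}   # fingerprint -> list of flash addresses

    def write(self, chunk: bytes) -> int:
        """Store a chunk and return its address, reusing an identical chunk."""
        fp = ecc_stand_in(chunk)
        # Step 1: a fingerprint match only marks candidates as "similar".
        for addr in self.index.get(fp, []):
            # Step 2: read the candidate back and compare every byte, so a
            # fingerprint collision can never cause a false duplicate.
            if self.flash[addr] == chunk:
                return addr
        # No identical chunk found: store it and index its fingerprint.
        self.flash.append(chunk)
        addr = len(self.flash) - 1
        self.index.setdefault(fp, []).append(addr)
        return addr


store = EccAssistedStore()
first = store.write(b"a" * CHUNK_SIZE)
other = store.write(b"b" * CHUNK_SIZE)
again = store.write(b"a" * CHUNK_SIZE)  # duplicate of the first chunk
```

In this sketch the duplicate write returns the address of the first chunk and stores nothing new. Because the final decision rests on the byte comparison rather than on the fingerprint, a fingerprint collision costs an extra read but cannot remove non-identical data, which is the sense in which such a scheme is collision-free.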
Year
2020
DOI
10.1109/ICCD50377.2020.00039
Venue
2020 IEEE 38th International Conference on Computer Design (ICCD)
Keywords
Data Deduplication, Flash Storage, ECC, Collision Free, Performance Evaluation
DocType
Conference
ISSN
1063-6404
ISBN
978-1-7281-9711-1
Citations
0
PageRank
0.34
References
6
Authors
7
Name            Order   Citations   PageRank
Suzhen Wu       1       283         23.14
Jindong Zhou    2       0           0.34
Weidong Zhu     3       3           3.47
Hong Jiang      4       2137        157.96
Zhijie Huang    5       26          3.19
Zhirong Shen    6       85          18.72
Bo Mao          7       163         19.27