Title
Deduplication in SSDs: Model and quantitative analysis
Abstract
In NAND flash-based SSDs, deduplication can effectively address three critical issues: cell lifetime, write performance, and garbage collection overhead. However, deduplication at the SSD device level differs in many respects from deduplication in enterprise storage systems; its success depends on properly exploiting the very limited hardware resources and the workload characteristics of SSDs. In this paper, we develop a novel deduplication framework tailored for SSDs. We first develop an analytical model that gives the minimum duplication rate required to achieve a performance gain for a given deduplication overhead. We then explore a number of design choices for implementing deduplication components in hardware or software. As a result, we propose two acceleration techniques: sampling-based filtering and recency-based fingerprint management. The former applies deduplication selectively based on sampling, and the latter makes effective use of limited controller memory while maximizing the deduplication ratio. We prototype the proposed framework on three physical hardware platforms and investigate deduplication efficiency across various CPU capabilities and hardware/software alternatives. Experimental results show that we achieve a duplication rate ranging from 4% to 51%, with an average of 17%, for the nine workloads considered in this work. The response time of a write request can be improved by up to 48%, with an average of 15%, while the lifespan of SSDs is expected to increase by up to 4.1 times, with an average of 2.4 times.
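The abstract names recency-based fingerprint management but does not say how it is implemented. The following is a minimal sketch of the general idea only, assuming a least-recently-used eviction policy, SHA-1 page fingerprints, and a simple append-only page allocator. All three are assumptions made for illustration, and the names `RecencyFingerprintStore` and `write_pages` are hypothetical, not taken from the paper.

```python
import hashlib
from collections import OrderedDict


class RecencyFingerprintStore:
    """Bounded fingerprint table that evicts the least recently used entry.

    Hypothetical illustration: models a controller that can only keep
    `capacity` fingerprints in memory.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.table = OrderedDict()  # fingerprint -> physical page number

    def lookup(self, fingerprint):
        """Return the stored physical page or None; a hit refreshes recency."""
        ppn = self.table.get(fingerprint)
        if ppn is not None:
            self.table.move_to_end(fingerprint)
        return ppn

    def insert(self, fingerprint, ppn):
        """Record a fingerprint, evicting the oldest entry if over capacity."""
        self.table[fingerprint] = ppn
        self.table.move_to_end(fingerprint)
        if len(self.table) > self.capacity:
            self.table.popitem(last=False)


def write_pages(pages, store):
    """Simulate page writes; return (logical-to-physical mapping, flash writes)."""
    mapping = []
    next_ppn = 0  # assumed append-only allocator
    for data in pages:
        fp = hashlib.sha1(data).digest()  # SHA-1 is an assumed fingerprint
        ppn = store.lookup(fp)
        if ppn is None:
            # No known duplicate: program a new flash page and remember it.
            ppn = next_ppn
            next_ppn += 1
            store.insert(fp, ppn)
        mapping.append(ppn)
    return mapping, next_ppn


if __name__ == "__main__":
    pages = [b"A", b"B", b"A", b"C", b"A"]
    mapping, flash_writes = write_pages(pages, RecencyFingerprintStore(2))
    print(mapping, flash_writes)
```

With room for only two fingerprints, the repeated page stays in the table because each hit refreshes its recency, so the less recently used entry is evicted instead. With room for a single fingerprint, every duplicate in this sequence is missed, which illustrates the trade-off between controller memory and the achievable deduplication ratio.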
Year
2012
DOI
10.1109/MSST.2012.6232379
Venue
MSST
Keywords
nand circuits, workload characteristics, model analysis, recency-based fingerprint management, deduplication efficiency, write performance, acceleration technique, storage management, cpu capability, deduplication framework, garbage collection overhead, cell lifetime, sampling-based filtering, quantitative analysis, filtering theory, nand flash-based ssd, hardware-software alternative, flash memories, generators, acceleration, garbage collection, hardware, storage system
Field
Data deduplication, Control theory, Flash memory, Computer science, Response time, Exploit, NAND gate, Software, Garbage collection, Operating system, Embedded system
DocType
Conference
ISSN
2160-195X
E-ISBN
978-1-4673-1746-7
ISBN
978-1-4673-1746-7
Citations
29
PageRank
0.95
References
18
Authors
10
Name             Order  Citations  PageRank
Jonghwa Kim      1      623        46.51
ChoongHyun Lee   2      56         2.43
Sang Yup Lee     3      207        19.79
Ikjoon Son       4      29         0.95
Jongmoo Choi     5      58         4.50
Sungroh Yoon     6      566        78.80
Hu-ung Lee       7      30         1.65
Sooyong Kang     8      261        26.47
Youjip Won       9      558        54.71
Jaehyuk Cha      10     167        13.66