Title
Optimizing the restoration performance of deduplication systems through an energy-saving data layout
Abstract
Data deduplication is an important data compression technique that removes copies of repeated data to enhance storage utilization, but it raises security and privacy risks, since sensitive user data are exposed to both insider and outsider attacks. A distinct drawback of the technique is data fragmentation, which not only slows down the restoration process but also leads to high power consumption. In this paper, we address this problem from the perspective of data layout. The core of our method is a novel RAID-5-based cross grouping data layout (CGDL). We introduce a selective deduplication algorithm (SDD) to perform data replication and restoration. We also propose a new CGDL-based disk scheduling algorithm (LDP) that predicts location dependence and saves energy by eliminating redundant disk read/write operations. We evaluate our method on the Linux MD (multiple device) driver modules. The experiments show that, under a 10-disk, 3-group storage configuration, our method improves restoration efficiency by 20% at the cost of only a 7.6% reduction in the deduplication ratio, while reducing power consumption by 23%.
Year: 2019
DOI: 10.1007/s12243-019-00711-z
Venue: Annals of Telecommunications
Keywords: Data deduplication, Data layout, Data restoration, Energy saving
Field: Data deduplication, Kernel (linear algebra), Replication (computing), Data layout, I/O scheduling, Fragmentation (computing), Electronic engineering, Data compression, Computer engineering, Mathematics, Power consumption
DocType: Journal
Volume: 74
Issue: 7
ISSN: 0003-4347
Citations: 0
PageRank: 0.34
References: 27
Authors: 6
Name            Order  Citations  PageRank
Fang Yan        1      8          3.23
Xi Yang         2      0          0.34
Jiamou Liu      3      49         23.19
Hengliang Tang  4      82         13.91
Yu-an Tan       5      163        18.37
Yuan-Zhang Li   6      120        12.63