Title
GLE-Dedup: A Globally-Locally Even Deduplication by Request-Aware Placement for Better Read Performance.
Abstract
Deduplication is a fundamental way to eliminate replicas and save space and network bandwidth in various storage systems. However, most existing deduplication systems leave room for improvement on normal reads, which carry crucial weight in the currently popular WORM (write-once, read-many) access model. Specifically, most existing deduplication systems achieve a globally even layout via the simple round-robin algorithm but ignore the relationship between chunks and IO requests in the placement policy. They therefore fail to achieve locally even placement within a request, which causes a read imbalance problem. In this paper, we focus on deduplication over small-scale storage systems with adequate bandwidth between nodes and propose GLE-Dedup, a deduplication system with a request-aware placement policy that achieves even placement both globally and locally for better read performance. Unlike conventional chunk-based placement, GLE-Dedup places chunks in groups, with the group size determined mainly by the request ID to achieve request-awareness. Chunks belonging to the same IO request are placed on different independent nodes as far as possible to achieve locally even placement within a request, while rotation among chunk groups maintains global balance. In this way, better parallelism is exploited for higher read performance. Experimental results under the real-world CAFTL trace show the effectiveness and advantage of GLE-Dedup over B-Dedup and R-Dedup, which use round-robin and random placement, respectively. For example, GLE-Dedup achieves about 18.9% and 24% read improvement over B-Dedup and R-Dedup, respectively.
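The abstract contrasts a global round-robin layout with a request-aware group placement but gives no algorithmic detail. The Python sketch below is only an illustration under assumptions: the function names, the tie-breaking rule, and the way the rotation offset advances are all hypothetical and are not taken from the paper. It shows how spreading the new chunks of one request over the nodes that request uses least, with a start offset that rotates between requests, can lower the per-request read load on the busiest node.

```python
from collections import Counter


def round_robin_placement(requests, num_nodes):
    """Baseline sketch: each new unique chunk goes to the next node globally."""
    placement = {}
    next_node = 0
    for chunks in requests:
        for chunk in chunks:
            if chunk not in placement:  # duplicate chunks are stored only once
                placement[chunk] = next_node
                next_node = (next_node + 1) % num_nodes
    return placement


def request_aware_placement(requests, num_nodes):
    """Hypothetical request-aware sketch (not the paper's exact algorithm).

    The chunks of one request form a group. Each new chunk goes to the node
    that currently serves the fewest chunks of this request; ties are broken
    in rotation order from a start offset that advances after every group.
    """
    placement = {}
    start = 0
    for chunks in requests:
        unique = list(dict.fromkeys(chunks))  # keep order, drop repeats
        # Nodes already holding deduplicated chunks of this request.
        used = Counter(placement[c] for c in unique if c in placement)
        new_chunks = [c for c in unique if c not in placement]
        for chunk in new_chunks:
            node = min(
                range(num_nodes),
                key=lambda n: (used[n], (n - start) % num_nodes),
            )
            placement[chunk] = node
            used[node] += 1
        # Rotate the start offset so groups do not all begin on the same node.
        start = (start + len(new_chunks)) % num_nodes
    return placement


def max_node_load(request, placement):
    """Number of chunks of one request served by its busiest node."""
    counts = Counter(placement[c] for c in set(request))
    return max(counts.values())
```

For instance, with four nodes and the two requests `["a", "b", "c", "d"]` and `["a", "e", "f", "g"]`, the round-robin baseline puts both `a` and `e` on the same node, so the second request reads two chunks from one node, whereas the request-aware sketch steers `e`, `f`, and `g` away from the node that already holds `a`.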
Year
2017
DOI
10.1007/s10766-016-0450-5
Venue
International Journal of Parallel Programming
Keywords
Deduplication, Even-placement, Read imbalance, Request-awareness, Group placement
Field
Data deduplication, Computer science, Bandwidth (signal processing), Distributed computing
DocType
Journal
Volume
45
Issue
4
ISSN
1573-7640
Citations
1
PageRank
0.35
References
31
Authors
5
Name
Order
Citations
PageRank
Ming-Zhu Deng183.20
Wei Chen28612.45
Nong Xiao3649116.15
Songping Yu484.89
Yupeng Hu5345.58