Title
QuickDedup: Efficient VM deduplication in cloud computing environments
Abstract
Deduplication is one of the major storage optimisation techniques for Virtual Machines (VMs) in cloud environments. Duplicate data blocks are usually identified by hashing each block and comparing the hashes. This paper proposes a novel deduplication approach, QuickDedup, that reduces the overall deduplication time, the metadata overhead, and the number of hash computations and subsequent comparisons for VM disk images. In addition to minimising deduplication-related metadata, a necessary by-product used to check for duplicates, QuickDedup follows a novel byte comparison scheme to sort blocks into classes. Hashes are then calculated and compared only within the respective classes, which eliminates or minimises the need for hash calculation and subsequent comparisons. As a result, QuickDedup saves the space required for hash storage during deduplication and makes deduplication of VM disk images much faster. We conducted a detailed evaluation of QuickDedup on various metrics with different kinds and sizes of VM images taken from publicly available datasets. The results show an improvement of up to 96% in the overall deduplication time for VM images, in addition to significant savings in metadata and storage overhead.
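The idea of classing blocks by a cheap byte comparison and hashing only within a class can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which bytes are compared, so the class key used here (first and last byte of a block), the 4096-byte block size, the use of SHA-256, and the names `split_blocks`, `block_class`, and `dedup` are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size; the abstract does not state one


def split_blocks(data, block_size=BLOCK_SIZE):
    """Cut a disk image (bytes) into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_class(block):
    """Hypothetical class key: the first and last byte of the block.

    Identical blocks always share a key, so duplicates can only
    occur inside the same class.
    """
    return block[:1] + block[-1:]


def dedup(blocks):
    """Return (unique indices, duplicate map, number of hashes computed)."""
    # Step 1: group block indices by the cheap byte-based class key.
    classes = defaultdict(list)
    for idx, block in enumerate(blocks):
        classes[block_class(block)].append(idx)

    unique = []        # indices of blocks that must be stored
    duplicate_of = {}  # duplicate index -> index of its first identical block
    hashes_computed = 0

    # Step 2: hash and compare only inside classes with several members.
    for members in classes.values():
        if len(members) == 1:
            # A block alone in its class cannot have a duplicate: no hash needed.
            unique.append(members[0])
            continue
        seen = {}  # digest -> index of first block with that digest
        for idx in members:
            digest = hashlib.sha256(blocks[idx]).digest()
            hashes_computed += 1
            if digest in seen:
                duplicate_of[idx] = seen[digest]
            else:
                seen[digest] = idx
                unique.append(idx)

    return sorted(unique), duplicate_of, hashes_computed
```

In this sketch a block that is alone in its class is never hashed, and hash lookups stay local to one class, which is one plausible reading of how both the number of hash computations and the hash storage could shrink.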
Year: 2020
DOI: 10.1016/j.jpdc.2020.01.002
Venue: Journal of Parallel and Distributed Computing
Keywords: Deduplication, VM disk image, Storage, Hashing performance
Field: Data deduplication, Metadata, Byte, Virtual machine, Computer science, Parallel computing, Hash function, Cloud computing, Computation
DocType: Journal
Volume: 139
Issue: C
ISSN: 0743-7315
Citations: 1
PageRank: 0.34
References: 0
Authors: 6
Name, Order, Citations, PageRank
Shweta Saharan, 1, 1, 0.34
Gaurav Somani, 2, 177, 11.85
Gaurav Gupta, 3, 1, 0.34
Robin Verma, 4, 7, 2.59
Manoj S. Gaur, 5, 501, 63.38
Rajkumar Buyya, 6, 23208, 1340.23