Title
Memory efficient sanitization of a deduplicated storage system
Abstract
Sanitization is the process of securely erasing sensitive data from a storage system, effectively restoring the system to a state as if the sensitive data had never been stored. Depending on the threat model, sanitization could require erasing all unreferenced blocks. This is particularly challenging in deduplicated storage systems because each piece of data on the physical media could be referred to by multiple namespace objects. For large storage systems, where available memory is a small fraction of storage capacity, standard techniques for tracking data references will not fit in memory, and we discuss multiple sanitization techniques that trade off I/O and memory requirements. We have three key contributions. First, we provide an understanding of the threat model and what is required to sanitize a deduplicated storage system as compared to a device. Second, we have designed a memory-efficient algorithm using perfect hashing that only requires from 2.54 to 2.87 bits per reference (98% savings) while minimizing the amount of I/O. Third, we present a complete sanitization design for EMC Data Domain.
Year
2013
Venue
FAST
Keywords
data reference, deduplicated storage system, memory efficient sanitization, sensitive data, storage system, memory requirement, storage capacity, available memory, memory efficient algorithm, large storage system, threat model
Field
Data domain, Computer data storage, Threat model, Computer science, Tracking data, Physical media, Namespace, Perfect hash function, Database
DocType
Conference
Citations
15
PageRank
0.59
References
30
Authors
4
Name                 Order  Citations  PageRank
Fabiano C. Botelho   1      174        11.06
Philip Shilane       2      1000       41.22
Nitin Garg           3      179        15.56
Windsor Hsu          4      211        8.10