Title
Correlation-Aware Stripe Organization for Efficient Writes in Erasure-Coded Storage Systems
Abstract
Erasure coding is widely used in production storage systems to protect data availability while maintaining a low degree of data redundancy. However, mitigating the parity update overhead of partial stripe writes in erasure-coded storage systems remains a critical concern. In this paper, we reconsider this problem from two new perspectives, data correlation and stripe organization, and propose CASO, a correlation-aware stripe organization algorithm. CASO captures the data correlation of a data access stream. It packs correlated data into a small number of stripes to reduce the I/Os incurred by partial stripe writes, and organizes uncorrelated data into stripes so that later accesses can exploit spatial locality. By differentiating correlated and uncorrelated data in stripe organization, we show via extensive trace-driven evaluation that CASO reduces parity updates by up to 25.1% and increases write speed by up to 28.4%.
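The effect of stripe organization on parity updates can be illustrated with a small sketch. This is not the CASO algorithm itself: the correlated groups are supplied by hand rather than mined from an access stream, and the cost model assumes that each write request updates all m parity chunks of every stripe it touches. The function names, the parameters k and m, and the toy request list are all hypothetical choices made for illustration.

```python
from typing import Dict, List


def sequential_layout(num_chunks: int, k: int) -> Dict[int, int]:
    """Baseline: chunk i goes to stripe i // k (k data chunks per stripe)."""
    return {chunk: chunk // k for chunk in range(num_chunks)}


def grouped_layout(groups: List[List[int]], k: int) -> Dict[int, int]:
    """Place chunks group by group, so chunks in one group share stripes.

    `groups` is a hand-written stand-in for correlated data; a real system
    would have to derive it from the access stream.
    """
    order = [chunk for group in groups for chunk in group]
    return {chunk: pos // k for pos, chunk in enumerate(order)}


def parity_updates(requests: List[List[int]], layout: Dict[int, int], m: int) -> int:
    """Count parity chunk updates under the assumed cost model:
    each request updates m parity chunks in every stripe it touches."""
    total = 0
    for request in requests:
        touched_stripes = {layout[chunk] for chunk in request}
        total += m * len(touched_stripes)
    return total


# Toy workload: 8 data chunks, k = 4 data chunks and m = 2 parity chunks
# per stripe. Chunks written together are (0, 4), (1, 5) and (2, 6).
k, m = 4, 2
requests = [[0, 4], [1, 5], [2, 6], [0, 4]]

baseline = parity_updates(requests, sequential_layout(8, k), m)
grouped = parity_updates(requests, grouped_layout([[0, 4, 1, 5], [2, 6, 3, 7]], k), m)

print(baseline)  # 16: every request spans two stripes
print(grouped)   # 8: every request stays within one stripe
```

In this toy case, co-locating chunks that are written together halves the number of parity updates. The savings reported in the abstract come from trace-driven evaluation and should not be inferred from this example.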
Year
2017
DOI
10.1109/SRDS.2017.18
Venue
2017 IEEE 36th Symposium on Reliable Distributed Systems (SRDS)
Keywords
erasure coding, data availability protection, production storage systems, data redundancy, parity update overhead, partial stripe, correlation-aware stripe organization algorithm, data access stream, parity updates, CASO, uncorrelated data
Field
Locality, Computer science, Real-time computing, Redundancy (engineering), Correlation, Data redundancy, Erasure code, Data access, Erasure, Encoding (memory), Distributed computing
DocType
Conference
ISSN
1060-9857
ISBN
978-1-5386-1680-2
Citations
1
PageRank
0.35
References
24
Authors
4
Name                 Order   Citations   PageRank
Zhirong Shen         1       85          18.72
Patrick P. C. Lee    2       1295        82.50
Jiwu Shu             3       709         72.71
Wenzhong Guo         4       611         76.01