Title | ||
---|---|---|
Correlation-Aware Stripe Organization for Efficient Writes in Erasure-Coded Storage: Algorithms and Evaluation |
Abstract | ||
---|---|---|
Erasure coding has been extensively employed for data availability protection in production storage systems by maintaining a low degree of data redundancy. However, how to mitigate the parity update overhead of partial stripe writes in erasure-coded storage systems is still a critical concern. In this paper, we study this problem from two new perspectives: data correlation and stripe organization. We propose CASO, a correlation-aware stripe organization algorithm, which captures data correlation of a data access stream and uses the data correlation characteristics for stripe organization. It packs correlated data into a small number of stripes to reduce the incurred I/Os in partial stripe writes, and further organizes uncorrelated data into stripes to leverage the spatial locality in later access. We implement CASO over Reed-Solomon codes and Azure's Local Reconstruction Codes, and show via extensive trace-driven evaluation that CASO reduces up to 29.8 percent of parity updates and reduces the write time by up to 46.7 percent. |
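
The abstract's central claim, that packing correlated data into fewer stripes reduces parity updates, can be illustrated with a small counting sketch. This is not the CASO algorithm itself. It assumes a Reed-Solomon-style layout in which every stripe touched by a write must update all `m` of its parity chunks, and the chunk names, the `place` and `parity_updates` helpers, and the two layouts are hypothetical.

```python
from typing import Dict, Iterable, List


def place(stripes: List[List[str]]) -> Dict[str, int]:
    """Map each data chunk name to the index of the stripe that holds it."""
    return {chunk: sid for sid, stripe in enumerate(stripes) for chunk in stripe}


def parity_updates(placement: Dict[str, int], request: Iterable[str], m: int) -> int:
    """Count parity chunk updates for one write request.

    Assumption: each stripe touched by the request updates all m parity chunks.
    """
    touched_stripes = {placement[chunk] for chunk in request}
    return len(touched_stripes) * m


m = 2  # parity chunks per stripe (hypothetical)
request = ["d0", "d2", "d5", "d7"]  # chunks that are written together

# Baseline: chunks laid out in arrival order, k = 4 data chunks per stripe.
baseline = place([["d0", "d1", "d2", "d3"], ["d4", "d5", "d6", "d7"]])

# Correlation-aware layout: the co-written chunks share one stripe.
grouped = place([["d0", "d2", "d5", "d7"], ["d1", "d3", "d4", "d6"]])

print(parity_updates(baseline, request, m))  # request spans two stripes
print(parity_updates(grouped, request, m))   # request spans one stripe
```

In the baseline layout the request touches both stripes, while in the grouped layout it touches only one, so the number of parity updates is halved under these assumptions.
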
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/tpds.2018.2890635 | IEEE Transactions on Parallel and Distributed Systems |
Keywords | Field | DocType |
---|---|---|
Encoding, Correlation, Organizations, Distributed databases, Redundancy, Maintenance engineering | Locality, Computer science, Algorithm, Data redundancy, Redundancy (engineering), Distributed database, Erasure code, Data access, Encoding (memory), Erasure | Journal |
Volume | Issue | ISSN |
---|---|---|
30 | 7 | 1045-9219 |
Citations | PageRank | References |
---|---|---|
0 | 0.34 | 0 |
Authors | ||
---|---|---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zhirong Shen | 1 | 85 | 18.72 |
Patrick P. C. Lee | 2 | 1295 | 82.50 |
Jiwu Shu | 3 | 709 | 72.71 |
Wenzhong Guo | 4 | 611 | 76.01 |