Title
Achieving High Reliability via Expediting the Repair of Critical Blocks in Replicated Storage Systems
Abstract
High reliability is critical to large data centers consisting of hundreds to thousands of storage nodes, where node failures are not rare. Data replication is a typical technique deployed to achieve high reliability. When a node failure is detected, blocks with lost replicas are identified and recovered. Long timeouts are usually used for node failure detection. For blocks with one lost replica, the long timeouts can significantly reduce the network traffic induced by data recovery. However, for blocks with two or more lost replicas, which can be caused by concurrent node failures that are not rare in large data centers, the long timeouts result in a high risk of losing these blocks. In this paper, we propose MFR, which separates the identification of blocks with two or more lost replicas from that of blocks with one lost replica, so that the former can be accelerated while the latter stays the same. Consequently, MFR can significantly improve data reliability while keeping the network traffic induced by data recovery stable. The results from our simulation and prototype implementation show that MFR improves the reliability of storage systems by a factor of up to 4.0 in terms of mean time to data loss. As blocks with two or more lost replicas are far fewer than blocks with one lost replica, the extra network traffic caused by MFR is less than 0.54% of the total network traffic for data recovery.
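The abstract does not describe MFR's internals, so the following is only a minimal Python sketch of the general idea it states: blocks with two or more unresponsive replicas are flagged after a short timeout, while blocks with a single unresponsive replica still wait for the usual long timeout. The function name, data layout, and both timeout values are assumptions made for illustration, not details taken from the paper.

```python
# Illustrative sketch only: names and timeout values are assumptions,
# not the paper's actual MFR design.
LONG_TIMEOUT = 900.0   # assumed seconds of silence before a node counts as failed
SHORT_TIMEOUT = 30.0   # assumed shorter window used only for multi-loss blocks


def blocks_to_repair(block_locations, last_heartbeat, now,
                     short_timeout=SHORT_TIMEOUT, long_timeout=LONG_TIMEOUT):
    """Split blocks needing repair into (critical, normal) lists.

    block_locations: dict mapping block id -> list of node ids holding a replica
    last_heartbeat:  dict mapping node id -> time of the node's last heartbeat
    """
    critical, normal = [], []
    for block, nodes in block_locations.items():
        silence = [now - last_heartbeat[n] for n in nodes]
        silent_short = sum(1 for s in silence if s >= short_timeout)
        silent_long = sum(1 for s in silence if s >= long_timeout)
        if silent_short >= 2:
            # Two or more replicas unresponsive: repair without waiting
            # for the long timeout.
            critical.append(block)
        elif silent_long >= 1:
            # A single lost replica: keep the long timeout so that
            # transient outages do not trigger recovery traffic.
            normal.append(block)
    return critical, normal
```

With these assumed values, a block whose replicas sit on two nodes that have both been silent for a minute would be queued as critical at once, whereas a block with one silent replica would be queued only after that node had been silent for the full long timeout.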
Year
2016
DOI
10.1109/SRDS.2016.16
Venue
Symposium on Reliable Distributed Systems Proceedings
Field
Replica, Replication (computing), Computer science, Data reliability, Expediting, Real-time computing, Data recovery, Mean time to data loss, Distributed computing
DocType
Conference
ISSN
1060-9857
Citations
0
PageRank
0.34
References
0
Authors
5
Name, Order, Citations, PageRank
Juntao Fang, 1, 0, 1.01
Shenggang Wan, 2, 112, 10.87
Ping Huang, 3, 184, 29.52
Xubin He, 4, 747, 63.49
Changsheng Xie, 5, 25, 6.27