Title
Distributed Fault Tolerant Linear System Solvers Based On Erasure Coding
Abstract
We present efficient coding schemes and distributed implementations of erasure coded linear system solvers. Erasure coded computations belong to the class of algorithmic fault tolerance schemes. They are based on augmenting an input dataset, executing the algorithm on the augmented dataset, and in the event of a fault, recovering the solution from the corresponding augmented solution. This process can be viewed as the computational analog of erasure coded storage schemes. The proposed technique has a number of important benefits: (i) as the hardware platform scales in size and number of faults, our scheme yields increasing improvement in resource utilization, compared to traditional schemes; (ii) the proposed scheme is easy to code the core algorithms remain the same; and (iii) the general scheme is flexible accommodating a range of computation and communication tradeoffs. We present new coding schemes for augmenting the input matrix that satisfy the recovery equations of erasure coding with high probability in the event of random failures. These coding schemes also minimize fill (non-zero elements introduced by the coding block), while being amenable to efficient partitioning across processing nodes. We demonstrate experimentally that our scheme adds minimal overhead for fault tolerance, yields excellent parallel efficiency and scalability, and is robust to different fault arrival models.
Year
DOI
Venue
2017
10.1109/ICDCS.2017.261
2017 IEEE 37TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2017)
Keywords
Field
DocType
Fault Tolerance, Erasure Coding, Linear System Solvers, Kruskal rank
Linear system,Computer science,Coding (social sciences),Fault tolerance,Erasure code,Sparse matrix,Erasure,Scalability,Encoding (memory),Distributed computing
Conference
ISSN
Citations 
PageRank 
1063-6927
1
0.35
References 
Authors
13
4
Name
Order
Citations
PageRank
Xuejiao Kang110.35
David F. Gleich291957.23
A. H. Sameh3562212.82
Ananth Grama41812136.25