Abstract | | |
---|---|---|
In this paper, we extend the theory and practice regarding algorithmic fault-tolerant matrix-matrix multiplication, C = AB, in a number of ways. First, we propose low-overhead methods for detecting errors introduced not only in C but also in A and/or B. Second, we show that, theoretically, these methods will detect all errors as long as only one entry is corrupted. Third, we propose a low-overhead roll-back approach to correct errors once detected. Finally, we give a high-performance implementation of matrix-matrix multiplication that incorporates these error detection and correction methods. Empirical results demonstrate that these methods work well in practice while imposing an acceptable level of overhead relative to high-performance implementations without fault-tolerance. | | |
Year | DOI | Venue |
---|---|---|
2001 | 10.1109/DSN.2001.941390 | DSN |
Keywords | Field | DocType |
empirical result,error detection,matrix-matrix multiplication,algorithmic fault-tolerant matrix-matrix multiplication,acceptable level,high-performance implementation,fault-tolerant high-performance matrix multiplication,low-overhead method,correction method,low-overhead roll-back approach,error detection and correction,fault tolerance,fault tolerant,linear algebra,fault detection,algorithms,multiplication,space technology,high performance computing,error correction,matrix multiplication,propulsion | Fault detection and isolation,Computer science,Algorithm,Error detection and correction,Implementation,Multiplication,Fault tolerance,Matrix multiplication | Conference |
ISBN | Citations | PageRank |
0-7695-1101-5 | 25 | 1.69 |
References | Authors | |
9 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
John A. Gunnels | 1 | 717 | 83.20 |
Robert A. van de Geijn | 2 | 2047 | 203.08 |
Daniel S. Katz | 3 | 1496 | 121.04 |
Enrique S. Quintana-Ortí | 4 | 1317 | 150.59 |
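
The abstract describes detecting a corrupted entry in C = AB at low overhead. The following is a minimal sketch of a generic checksum-based check in the spirit of algorithmic fault tolerance, not the authors' implementation. It assumes all-ones checksum vectors, an absolute tolerance `tol`, and the hypothetical helper names `matmul` and `checksum_mismatch`, none of which come from the record above. It compares the row sums of C against A(Bw) and the column sums of C against (vᵀA)B, which costs O(n²) work next to the O(n³) product.

```python
def matmul(a, b):
    """Plain triple-loop product of two dense matrices stored as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def checksum_mismatch(a, b, c, tol=1e-9):
    """Return (bad_rows, bad_cols): indices where checksums of C disagree
    with the checksums predicted from A and B.

    Assumption for this sketch: the checksum vectors are all ones and the
    comparison uses a fixed absolute tolerance.
    """
    n, k, m = len(a), len(b), len(b[0])

    # Right-sided check: C*w versus A*(B*w), with w the all-ones vector.
    bw = [sum(row) for row in b]
    abw = [sum(a[i][p] * bw[p] for p in range(k)) for i in range(n)]
    cw = [sum(row) for row in c]
    bad_rows = [i for i in range(n) if abs(cw[i] - abw[i]) > tol]

    # Left-sided check: v^T*C versus (v^T*A)*B, with v the all-ones vector.
    va = [sum(a[i][p] for i in range(n)) for p in range(k)]
    vab = [sum(va[p] * b[p][j] for p in range(k)) for j in range(m)]
    vc = [sum(c[i][j] for i in range(n)) for j in range(m)]
    bad_cols = [j for j in range(m) if abs(vc[j] - vab[j]) > tol]

    return bad_rows, bad_cols


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = matmul(a, b)
clean = checksum_mismatch(a, b, c)

c[1][0] += 1  # simulate a single corrupted entry in C
flagged = checksum_mismatch(a, b, c)

# Simplest possible roll-back for this sketch: recompute the product.
if flagged != ([], []):
    c = matmul(a, b)
recovered = checksum_mismatch(a, b, c)
```

In this sketch a single corrupted entry of C shows up as one flagged row and one flagged column, which together locate it. With floating-point data a fixed absolute tolerance is a simplification; a practical threshold would have to account for rounding error in the checksums themselves.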