Fault-Tolerant High-Performance Matrix Multiplication: Theory and Practice - Citegraph

Paper Info

Title
Fault-Tolerant High-Performance Matrix Multiplication: Theory and Practice

Abstract
In this paper, we extend the theory and practice regarding algorithmic fault-tolerant matrix-matrix multiplication, C = AB, in a number of ways. First, we propose low-overhead methods for detecting errors introduced not only in C but also in A and/or B. Second, we show that, theoretically, these methods will detect all errors as long as only one entry is corrupted. Third, we propose a low-overhead roll-back approach to correct errors once detected. Finally, we give a high-performance implementation of matrix-matrix multiplication that incorporates these error detection and correction methods. Empirical results demonstrate that these methods work well in practice while imposing an acceptable level of overhead relative to high-performance implementations without fault-tolerance.
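
The abstract does not spell out the detection scheme, so the following is only a minimal sketch of the classic checksum idea behind algorithmic fault tolerance, not necessarily the authors' exact method. It assumes all-ones checksum vectors, an absolute tolerance `tol`, and a whole-product recomputation standing in for roll-back; the helper names (`checksums_match`, `checked_matmul`) are hypothetical. It only catches a C that disagrees with the A and B it is checked against.

```python
def matmul(A, B):
    """Plain triple-loop product of two list-of-lists matrices."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(inner)) for j in range(cols)]
            for i in range(rows)]


def matvec(M, x):
    """Matrix times column vector."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]


def vecmat(y, M):
    """Row vector times matrix."""
    return [sum(y[i] * M[i][j] for i in range(len(y))) for j in range(len(M[0]))]


def checksums_match(A, B, C, tol=1e-9):
    """Check C against A and B using right and left checksums.

    Right test: C w == A (B w).  Left test: v^T C == (v^T A) B.
    Each test costs only matrix-vector products, far less than a second
    full multiplication. The all-ones vectors and the absolute tolerance
    are assumptions of this sketch; real floating-point data would need
    a tolerance scaled to the magnitude of the operands.
    """
    w = [1.0] * len(B[0])
    v = [1.0] * len(A)
    right_ok = all(abs(x - y) <= tol
                   for x, y in zip(matvec(C, w), matvec(A, matvec(B, w))))
    left_ok = all(abs(x - y) <= tol
                  for x, y in zip(vecmat(v, C), vecmat(vecmat(v, A), B)))
    return right_ok and left_ok


def checked_matmul(A, B, multiply=matmul, max_retries=2):
    """Multiply, verify, and recompute on a failed check.

    Recomputing the whole product is a simplification of roll-back:
    a practical scheme would redo only the affected part of the work.
    """
    for _ in range(max_retries + 1):
        C = multiply(A, B)
        if checksums_match(A, B, C):
            return C
    raise RuntimeError("checksum test failed after all retries")
```

Passing a custom `multiply` makes it possible to inject a fault deliberately and watch the check reject the corrupted product before the retry succeeds.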

Year	2001
DOI	10.1109/DSN.2001.941390
Venue	DSN
DocType	Conference
ISBN	0-7695-1101-5
Citations	25
PageRank	1.69
References	9
Authors	4
Keywords	empirical result, error detection, matrix-matrix multiplication, algorithmic fault-tolerant matrix-matrix multiplication, acceptable level, high-performance implementation, fault-tolerant high-performance matrix multiplication, low-overhead method, correction method, low-overhead roll-back approach, error detection and correction, fault tolerance, fault tolerant, linear algebra, fault detection, algorithms, multiplication, space technology, high performance computing, error correction, matrix multiplication, propulsion
Field	Fault detection and isolation, Computer science, Algorithm, Error detection and correction, Implementation, Multiplication, Fault tolerance, Matrix multiplication

Authors (4 rows)

Name	Order	Citations	PageRank
John A. Gunnels	1	717	83.20
Robert A. van de Geijn	2	2047	203.08
Daniel S. Katz	3	1496	121.04
Enrique S. Quintana-Ortí	4	1317	150.59
