Title
Characterization of the Impact of Soft Errors on Iterative Methods
Abstract
Soft errors caused by transient bit flips have the potential to significantly impact an application's behavior. This has motivated the design of an array of techniques to detect, isolate, and correct soft errors at the microarchitectural, architectural, compilation, or application level to minimize their impact on the executing application. The first step toward the design of good error detection/correction techniques involves an understanding of an application's vulnerability to soft errors. In this paper, we present the first comprehensive characterization of the impact of soft errors on the convergence characteristics of six iterative methods using application-level fault injection. In particular, we consider the use of iterative methods to incrementally solve a linear system of equations, which constitutes the core kernel in many scientific applications. We analyze the impact of soft errors in terms of the type of error (single- vs. multi-bit), the distribution and location of bits affected, the data structure and statement impacted, and variation with time. In addition to understanding the vulnerability of iterative solvers to soft errors, this characterization can aid the design of fault injection campaigns that ensure systematic coverage.
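As a rough illustration of what application-level fault injection into an iterative solver can look like, the sketch below flips a single bit in one entry of the solution vector during a Jacobi iteration. It is not the paper's tooling: the choice of Jacobi, the helper names `flip_bit` and `jacobi`, the `fault` triple, and the small test system are all assumptions made for this example.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of the IEEE-754 double encoding of `value`.

    Bit 0 is the least significant mantissa bit, bits 52-62 are the
    exponent, and bit 63 is the sign.
    """
    (as_int,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped


def jacobi(A, b, tol=1e-10, max_iter=10_000, fault=None):
    """Jacobi iteration on a dense system given as lists.

    `fault` is a hypothetical injection hook: an (iteration, index, bit)
    triple that corrupts x[index] once, just before that iteration runs.
    Returns the final iterate and the number of iterations performed.
    """
    n = len(b)
    x = [0.0] * n
    for k in range(max_iter):
        if fault is not None and fault[0] == k:
            _, idx, bit = fault
            x[idx] = flip_bit(x[idx], bit)  # the injected soft error
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        diff = max(abs(x_new[i] - x[i]) for i in range(n))
        x = x_new
        if diff < tol:
            return x, k + 1
    return x, max_iter


# Assumed example system: strictly diagonally dominant, exact solution [1, 2, 3].
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]

clean_x, clean_iters = jacobi(A, b)
# Flip the lowest exponent bit of x[0] at iteration 5 (doubles or halves it).
faulty_x, faulty_iters = jacobi(A, b, fault=(5, 0, 52))

print("fault-free:", clean_x, clean_iters)
print("with fault:", faulty_x, faulty_iters)
```

Varying the `bit`, `index`, and `iteration` fields of the injected fault corresponds loosely to the dimensions named in the abstract (location of the affected bit, data structure impacted, and variation with time). Flips in high exponent bits can produce infinities or NaNs, in which case this sketch simply runs until `max_iter`.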
Year
2018
DOI
10.1109/HiPC.2018.00031
Venue
HiPC
Keywords
Iterative methods, Sparse matrices, Data structures, Tools, Convergence, Mathematical model, Space exploration
Field
Convergence (routing), Kernel (linear algebra), Data structure, System of linear equations, Computer science, Iterative method, Parallel computing, Algorithm, Error detection and correction, Sparse matrix, Fault injection
DocType
Conference
ISSN
1094-7256
ISBN
978-1-5386-8386-6
Citations
1
PageRank
0.36
References
0
Authors
6
Name                   Order  Citations  PageRank
Burcu Ozcelik Mutlu    1      4          1.88
Gokcen Kestor          2      148        14.25
Joseph Manzano         3      37         8.63
Osman Unsal            4      164        14.33
Samrat Chatterjee      5      2          1.72
Sriram Krishnamoorthy  6      1202       86.68