Title
FlipIt: An LLVM Based Fault Injector for HPC.
Abstract
High performance computing (HPC) is increasingly subjected to faulty computations. The frequency of silent data corruptions (SDCs) in particular is expected to increase in emerging machines, requiring HPC applications to handle SDCs. In this paper, we propose a robust fault injector structured through an LLVM compiler pass that allows simulation of SDCs in various applications. Although fault injection locations are enumerated at compile time, their activation is purely at runtime and based on a user-provided fault distribution. The robustness of our fault injector lies in the ability to augment the runtime injection logic on a per-application basis. This allows tighter control over the spatial and temporal placement and the probability of injected faults. The usability, scalability, and robustness of our fault injector are demonstrated by injecting faults into an algebraic multigrid solver.
Year
2014
DOI
10.1007/978-3-319-14325-5_47
Venue
Lecture Notes in Computer Science
Field
Supercomputer, Soft error, Computer science, Compile time, Parallel computing, Robustness (computer science), Compiler, Solver, Fault injection, Distributed computing, Scalability, Embedded system
DocType
Conference
Volume
8805
ISSN
0302-9743
Citations
15
PageRank
0.65
References
13
Authors
3
Name | Order | Citations | PageRank
Jon Calhoun | 1 | 47 | 4.75
Luke Olson | 2 | 235 | 21.93
M. Snir | 3 | 3984 | 520.82