Title
Low-cost prediction-based fault protection strategy.
Abstract
The increasing rate of failures from transient faults calls for a cost-efficient protection mechanism that is always active. We therefore propose a novel prediction-based transient fault protection strategy as a low-cost, software-only technique. Instead of re-executing expensive computations for validation, an output prediction cheaply determines an approximate value for a sequence of computations. When the actual computation and the prediction agree within a predefined acceptable range, the computation is assumed fault-free and the expensive re-computation can be skipped. This approach makes a significant reduction in dynamic instruction count possible. Missed faults may occur, but their frequency can be explicitly kept small by choosing a proper acceptable range. For evaluation, we build an automatic compilation system, called RSkip, that transforms a program into a resilient executable with the prediction-based protection scheme. Prior instruction replication work incurs a 2.33x execution time relative to unreliable execution over nine compute-intensive benchmarks. With control over the loss in protection rate, RSkip reduces the protection overhead to 1.27x by skipping redundant computation in our target loops at a rate of 81.10
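The check described in the abstract can be illustrated with a minimal Python sketch. It is not RSkip itself, which is a compiler transformation. The function name `run_protected`, the caller-supplied `predict` function, the relative tolerance `rel_tol`, and the three-way vote used for recovery are all assumptions made for illustration; the abstract does not specify the predictor or the recovery policy.

```python
from typing import Callable, List, Sequence, Tuple


def run_protected(
    inputs: Sequence[float],
    compute: Callable[[float], float],
    predict: Callable[[float], float],
    rel_tol: float,
) -> Tuple[List[float], float, int]:
    """Run `compute` on each input, validating each result against a cheap prediction.

    Returns (outputs, skip_rate, faults_detected). All names are hypothetical.
    """
    outputs: List[float] = []
    skipped = 0
    faults = 0
    for x in inputs:
        value = compute(x)   # the actual (expensive) computation
        approx = predict(x)  # the cheap approximate value
        # Acceptable range: relative to the predicted value (an assumed choice).
        if abs(value - approx) <= rel_tol * abs(approx):
            # Prediction and computation agree: assume fault-free, skip re-execution.
            skipped += 1
        else:
            # Disagreement: fall back to conventional re-execution for validation.
            redo = compute(x)
            if redo != value:
                # The two runs differ, so one was likely hit by a transient fault.
                # Assumed recovery policy: a third run breaks the tie.
                faults += 1
                third = compute(x)
                if redo == third:
                    value = redo
                # Otherwise keep `value` (it matches `third`, or all three differ).
        outputs.append(value)
    skip_rate = skipped / len(inputs) if inputs else 0.0
    return outputs, skip_rate, faults
```

In this sketch a larger `rel_tol` raises the skip rate but lets more corrupted values pass unnoticed, which mirrors the trade-off between overhead and missed faults that the abstract describes.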
Year: 2020
DOI: 10.1145/3368826.3377920
Venue: CGO
Keywords: Reliability, Approximation computing, Redundancy
Field: Computer science, Cost prediction, Real-time computing, Reliability engineering
DocType: Conference
ISSN: 2164-2397
ISBN: 978-1-4503-7047-9
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name            Order   Citations   PageRank
Sunghyun Park   1       154         10.83
Shikai Li       2       0           0.34
Ze Zhang        3       1           1.37
Scott Mahlke    4       4811        312.08