Title
Comparing the effects of intermittent and transient hardware faults on programs
Abstract
The trends of shrinking device geometries, lower voltages and higher frequencies in modern processors are expected to increase the rate of intermittent faults. This requires the design of software that is resilient to intermittent faults. There has been substantial research on software systems that are resilient to transient faults. However, it is unclear whether the impact of intermittent faults on programs is similar to that of transient faults. This is important for deciding whether we need novel techniques for tolerating intermittent faults in software. In this study, we attempt to answer this question by comparing the effects of intermittent and transient hardware faults on programs through fault-injection experiments performed in a micro-architectural simulator for a simple five-stage pipelined processor. We also investigate whether the differences (if any) vary with the length (i.e., duration in cycles) of the fault and with the micro-architectural unit in which the fault originates. The results show that the impact of intermittent faults on programs is significantly different from that of transient faults, and that the difference depends on both the length of the fault and the fault's origin. Therefore, existing software techniques for ensuring resilience to transient faults may not be sufficient for intermittent faults, and new techniques are needed.
Year
2011
DOI
10.1109/DSNW.2011.5958835
Venue
DSN Workshops
Keywords
micro-architectural-level fault injection, intermittent fault, micro-architectural unit, micro-architectural simulator, fault injection experiment, higher frequency, transient fault, existing software technique, program diagnostics, software fault tolerance, intermittent hardware fault, microarchitectural simulator, five-stage pipelined processor, transient hardware fault, device geometries, fault-injection experiment, software design, pipeline processing, software system, software systems, indexing terms, benchmark testing
Field
Software design, Computer science, Voltage, Software fault tolerance, Software system, Real-time computing, Intermittent fault, Software, Power-system protection, Computer hardware, Benchmark (computing)
DocType
Conference
ISBN
978-1-4577-0373-7
Citations
8
PageRank
0.63
References
15
Authors
4
Name                     Order  Citations  PageRank
Wei, Jiesheng            1      17         1.86
Layali Rashid            2      39         3.41
Karthik Pattabiraman     3      1030       55.17
Sathish Gopalakrishnan   4      426        33.10