Title
Trade-offs in transient fault recovery schemes for redundant multithreaded processors
Abstract
CMOS downscaling trends, manifested in smaller transistor feature sizes and lower supply voltages, make microprocessors increasingly vulnerable to transient errors with each new technology generation. One architectural approach to detecting and recovering from such errors is to execute two copies of the same program and compare their results. While comparing only the store instructions is sufficient for error detection, register values must also be compared to support fault recovery. In this paper, we propose novel checkpoint-assisted mechanisms for efficient fault recovery that dramatically reduce the number of register values that must be compared to detect soft errors. We also perform a comprehensive investigation of these and other existing recovery schemes from the standpoint of performance, power, and design complexity.
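The general idea of redundant execution with checkpointed recovery can be illustrated with a toy model. The sketch below is not the mechanism proposed in the paper; the three-instruction ISA, the `run_redundant` function, the `fault` tuple, and the `checkpoint_interval` parameter are all hypothetical names chosen for illustration. It runs two copies of a small program in lockstep, compares store outputs on every store, compares full register state only at checkpoint boundaries, and rolls both copies back to the last verified checkpoint on any mismatch.

```python
# Toy ISA (hypothetical): ("addi", rd, rs, imm), ("add", rd, rs1, rs2),
# ("store", addr, rs). Register 0 is never written and stays 0.
PROGRAM = [
    ("addi", 1, 0, 5),
    ("addi", 2, 0, 7),
    ("add", 3, 1, 2),
    ("store", 100, 3),
    ("addi", 3, 3, 1),
    ("store", 104, 3),
]


def step(regs, instr):
    """Execute one instruction; return (addr, value) for a store, else None."""
    op = instr[0]
    if op == "addi":
        _, rd, rs, imm = instr
        regs[rd] = regs[rs] + imm
    elif op == "add":
        _, rd, rs1, rs2 = instr
        regs[rd] = regs[rs1] + regs[rs2]
    elif op == "store":
        _, addr, rs = instr
        return (addr, regs[rs])
    return None


def run_redundant(program, fault=None, checkpoint_interval=2):
    """Run two copies in lockstep; return (memory, number_of_rollbacks).

    fault = (pc, reg, xor_mask) is an assumed fault model: it corrupts one
    register of copy A exactly once, right after instruction `pc` executes.
    """
    regs_a, regs_b = [0] * 4, [0] * 4
    memory = {}
    checkpoint = (0, list(regs_a))  # (pc to resume at, verified registers)
    pc, rollbacks, pending_fault = 0, 0, fault

    while pc < len(program):
        out_a = step(regs_a, program[pc])
        out_b = step(regs_b, program[pc])

        # Inject the transient fault once, into copy A only.
        if pending_fault is not None and pending_fault[0] == pc:
            regs_a[pending_fault[1]] ^= pending_fault[2]
            pending_fault = None

        # Stores are compared every time; registers only at boundaries.
        at_boundary = (pc + 1) % checkpoint_interval == 0
        if out_a != out_b or (at_boundary and regs_a != regs_b):
            pc, saved = checkpoint
            regs_a, regs_b = list(saved), list(saved)
            rollbacks += 1
            continue

        if out_a is not None:
            memory[out_a[0]] = out_a[1]  # commit only a verified store
        pc += 1
        if at_boundary:
            checkpoint = (pc, list(regs_a))  # registers agreed above

    return memory, rollbacks


if __name__ == "__main__":
    print(run_redundant(PROGRAM))                        # fault-free run
    print(run_redundant(PROGRAM, fault=(2, 3, 0b100)))   # caught at a store
    print(run_redundant(PROGRAM, fault=(1, 1, 1)))       # caught at a checkpoint
```

In this toy, a corrupted register that reaches a store is caught by the store comparison, while one that is still latent at a checkpoint boundary is caught by the register comparison, so a checkpoint never records corrupted state. Re-executed stores simply rewrite the same values here, which a real design could not assume in general.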
Year: 2006
DOI: 10.1007/11945918_18
Venue: HiPC
Keywords: design complexity, architectural approach, register value, transient fault recovery scheme, fault recovery, existing recovery scheme, error detection, cmos downscaling trend, lower supply voltage, efficient fault recovery, comprehensive investigation, redundant multithreaded processor, soft error
Field: Soft error, Computer science, CPU cache, Microprocessor, Parallel computing, Register file, CMOS, Error detection and correction, Redundancy (engineering), Fault tolerance, Computer engineering, Embedded system
DocType: Conference
Volume: 4297
ISSN: 0302-9743
ISBN: 3-540-68039-X
Citations: 6
PageRank: 0.43
References: 17
Authors: 5

Name                 Order  Citations  PageRank
Joseph J. Sharkey    1      124        8.44
Nayef Abu-Ghazeleh   2      6          0.43
Dmitry Ponomarev     3      893        56.45
Kanad Ghose          4      1220       113.50
Aneesh Aggarwal      5      202        16.91