Title
A low-level software-based fault tolerance approach to detect SEUs in GPUs' register files.
Abstract
This paper presents an approach that applies software-based fault tolerance techniques at a low abstraction level to detect SEU faults in the register files of Graphics Processing Units. SEU faults have a major influence on such architectures, especially affecting register files and cache memory. To harden the system's register files, software-based techniques are presented and tuned to detect faults in vector, address, and predicate registers. A fault injection campaign at Register Transfer Level is performed on the register files of a G80 general-purpose graphics processing unit running four case-study applications. Results show error reductions of up to 100% and execution-time overheads of up to 1.78 times the original values. (C) 2017 Elsevier Ltd. All rights reserved.
Year
2017
DOI
10.1016/j.microrel.2017.07.035
Venue
MICROELECTRONICS RELIABILITY
Keywords
Graphic processing unit, Fault tolerance, Single Event Upset
Field
CPU cache, Computer science, Software fault tolerance, Stack register, Fault tolerance, Software, Graphics processing unit, Processor register, Fault injection, Embedded system
DocType
Journal
Volume
76
ISSN
0026-2714
Citations
7
PageRank
0.58
References
5
Authors
4
Name | Order | Citations | PageRank
Marcio Gonçalves | 1 | 10 | 1.03
Mateus Saquetti | 2 | 13 | 3.10
Fernanda Lima Kastensmidt | 3 | 554 | 61.82
Jose Rodrigo Azambuja | 4 | 13 | 3.13