Title
Epipe: A low-cost fault-tolerance technique considering WCET constraints
Abstract
Transient faults will soon become a critical reliability concern for processors used in mainstream computing. Because the mainstream commodity market accepts only low-cost solutions for transient-fault tolerance, traditional high-end solutions are ruled out by their prohibitive costs. This paper presents Epipe, a hybrid software/hardware solution that provides sufficient fault coverage with affordable overhead for mainstream commodity systems. Given a program, Epipe uses compile-time analysis to identify its vulnerable instructions (VIs), i.e., those that may cause silent data corruptions (SDCs), and selects a subset of VIs to protect subject to worst-case execution time (WCET) constraints on the fault-free execution. During program execution on a modified superscalar processor that incurs minimal hardware overhead, Epipe relies on selective instruction replication to handle the VI-induced SDCs and on an existing exception detector to tolerate the remaining faults, which manifest as system exceptions. Our experimental results show that Epipe provides sufficient fault coverage under some tight WCET constraints and increasingly higher coverage under more relaxed WCET constraints. As the WCET allowance increases from 5% to 15% and then to 25%, the average coverage increases from 70.8% to 80% and then to 86.6%. Unlike existing hybrid solutions, Epipe is the first to respect WCET constraints, which are an important concern for real-time systems.
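The selection step described in the abstract (protect a subset of VIs without exceeding a WCET allowance) can be illustrated with a rough sketch. The abstract does not say how Epipe actually makes this choice, so everything below is an assumption for illustration only: the greedy benefit-per-cost heuristic, the names VulnerableInstruction, sdc_weight, wcet_cost, select_for_protection and coverage, and the simplification that each replicated instruction adds a fixed number of cycles to the WCET.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VulnerableInstruction:
    # All fields are hypothetical; the abstract does not define them.
    name: str          # label for the instruction
    sdc_weight: float  # assumed estimate of its share of SDC-causing faults
    wcet_cost: int     # assumed extra worst-case cycles if replicated (> 0)


def select_for_protection(vis, baseline_wcet, allowance):
    """Greedily pick VIs by weight-per-cycle until the WCET budget is used.

    allowance is a fraction of the fault-free WCET, e.g. 0.05 for 5%.
    This is a stand-in heuristic, not the selection method of the paper.
    """
    budget = baseline_wcet * allowance
    ranked = sorted(vis, key=lambda vi: vi.sdc_weight / vi.wcet_cost,
                    reverse=True)
    chosen, spent = [], 0
    for vi in ranked:
        # Skip any instruction whose replication would exceed the budget.
        if spent + vi.wcet_cost <= budget:
            chosen.append(vi)
            spent += vi.wcet_cost
    return chosen, spent


def coverage(chosen, vis):
    """Fraction of the total assumed SDC weight that is protected."""
    total = sum(vi.sdc_weight for vi in vis)
    return sum(vi.sdc_weight for vi in chosen) / total if total else 0.0
```

Under these assumptions, relaxing the allowance can only keep or enlarge the budget, which matches the qualitative trend reported in the abstract (higher coverage under more relaxed WCET constraints). Summing per-instruction costs is pessimistic, since only instructions on the worst-case path affect the real WCET.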
Year: 2013
DOI: 10.1016/j.sysarc.2013.06.003
Venue: Journal of Systems Architecture - Embedded Systems Design
Keywords: mainstream commodity market, wcet constraint, higher coverage, coverage increase, mainstream computing, sufficient fault coverage, tight wcet constraint, wcet allowance increase, fault-free execution, low-cost fault-tolerance technique, mainstream commodity system, fault tolerance
Field: Fault coverage, Computer science, Parallel computing, Real-time computing, Fault tolerance, Software, Execution time, Superscalar
DocType: Journal
Volume: 59
Issue: 10
ISSN: 1383-7621
Citations: 4
PageRank: 0.39
References: 37
Authors: 6
Name            Order   Citations   PageRank
Jianli Li       1       13          3.23
Jingling Xue    2       1627        124.20
Xinwei Xie      3       79          3.16
Qing Wan        4       29          5.03
Qingping Tan    5       135         23.97
Lanfang Tan     6       12          3.26