Title
A Framework for Node-Level Fault Tolerance in Distributed Real-Time Systems
Abstract
This paper describes a framework for achieving node-level fault tolerance (NLFT) in distributed real-time systems. The objective of NLFT is to mask errors at the node level in order to reduce the probability of node failures and thereby improve system dependability. We describe an approach called light-weight NLFT, in which transient faults are masked locally in the nodes by time-redundant execution of application tasks. The advantages of light-weight NLFT are demonstrated by a reliability analysis of an example brake-by-wire architecture. The results show that the use of light-weight NLFT may provide 55% higher reliability after one year and almost 60% higher MTTF, compared to using fail-silent nodes.
Year
2005
DOI
10.1109/DSN.2005.7
Venue
DSN
Keywords
node-level fault tolerance,application task,node level,node failure,fail-silent node,reliability analysis,higher mttf,example brake-by-wire architecture,light-weight nlft,real-time systems,higher reliability,mask error,distributed processing,error handling,real time systems,fault tolerant,reliability,probability
Field
Mean time between failures,Dependability,Computer science,Real-time computing,Fault tolerance,Reliability engineering,Distributed computing
DocType
Conference
ISSN
1530-0889
ISBN
0-7695-2282-3
Citations
8
PageRank
0.79
References
7
Authors
3
Name | Order | Citations | PageRank
Joakim Aidemark | 1 | 136 | 9.49
Peter Folkesson | 2 | 8 | 0.79
Johan Karlsson | 3 | 305 | 25.29