Title
Real-time scheduling under fault bursts with multiple recovery strategy
Abstract
In this paper, we consider the feasibility problem of a set of real-time jobs which may be subject to a fault burst during execution. A fault burst represents a time interval during which multiple jobs may incur faults; hence multiple recoveries may be needed. We show that determining the feasibility of a real-time system, which may be subject to a fault burst that may last at most Δ time units, is an NP-Hard problem even when the exact position of the fault burst is known a priori. However, in a practical system, the fault burst may occur at any arbitrary and unpredictable time. We develop feasibility analysis by assuming multiple recovery strategy where, in addition to the job at the end of which the fault is detected, all preempted tasks are also re-executed. We formally characterize the overhead that a scheduler incurs due to a fault burst and present a generic recovery strategy, called Δ-idling, that is shown to minimize the worst-case overhead for any priority-driven scheduling algorithm. Next, we analyze periodic task systems. We show that the preemptive EDF policy, when coupled with Δ-idling, provides the highest possible utilization bound ½ (1 - Δ over Pmin), where Pmin is the smallest task period. We also present an empirical evaluation of the EDF policy with Δ-idling over synthetically generated task sets, and show that it offers a clear improvement over the naive EDF policy that triggers the recovery tasks as soon as an error is detected.
Year
DOI
Venue
2014
10.1109/RTAS.2014.6925991
Real-Time and Embedded Technology and Applications Symposium
Keywords
Field
DocType
computational complexity,fault diagnosis,processor scheduling,real-time systems,system recovery,Δ-idling,NP-Hard problem,fault burst,fault detection,generic recovery strategy,multiple recovery strategy,naive EDF policy,periodic task systems,priority-driven scheduling algorithm,real-time jobs,real-time scheduling,real-time system,scheduler overhead,utilization bound,worst-case overhead
Metrical task system,Fixed-priority pre-emptive scheduling,Fault detection and isolation,Scheduling (computing),Computer science,A priori and a posteriori,Real-time computing,Schedule,Unit of time,Periodic graph (geometry),Distributed computing
Conference
ISSN
Citations 
PageRank 
1080-1812
9
0.48
References 
Authors
18
3
Name
Order
Citations
PageRank
Mohammad A. Haque19910.07
Hakan Aydin2121861.97
Da-Kai Zhu3140566.97