Title
Bounded-time recovery for distributed real-time systems
Abstract
This paper explores bounded-time recovery (BTR), a new approach to making cyber-physical systems robust to crash faults. Rather than trying to mask the symptoms of a fault with massive redundancy, BTR detects faults at runtime and enables the system to recover from them – e.g., by transferring tasks to other nodes that are still working correctly. When a fault does occur, there is a brief period of instability during which the system can produce incorrect outputs. However, many cyber-physical systems have physical properties – such as inertia or thermal capacity – that limit the rate at which the state of the system can change; thus, a very brief outage is often acceptable, as long as its duration can be bounded, to perhaps a few milliseconds.
BTR has some interesting properties: for instance, it has a much lower overhead than Paxos, and, unlike Paxos, it can take useful actions even when the system partitions or a majority of the nodes fails. However, it also poses a very unusual scheduling problem that involves creating sets of interrelated schedules for different failure modes. We present a scheduling algorithm called Cascade that can quickly find suitable schedules. Using a prototype implementation, we show that Cascade scales far better than a baseline algorithm and reduces the scheduling time from hours to a few seconds, without sacrificing quality.
Year
2020
DOI
10.1109/RTAS48715.2020.00-13
Venue
2020 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS)
Keywords
design space exploration for RT for latency-sensitive systems, scheduling and resource allocation for RT or latency-sensitive systems, system-level optimization and co-design techniques for RT or latency-sensitive systems
DocType
Conference
ISSN
1545-3421
ISBN
978-1-7281-5500-5
Citations
0
PageRank
0.34
References
24
Authors
5
Name | Order | Citations | PageRank
Neeraj Gandhi | 1 | 1 | 1.37
Edo Roth | 2 | 5 | 2.74
Robert Gifford | 3 | 3 | 1.75
Linh Thi Xuan Phan | 4 | 9 | 5.54
Andreas Haeberlen | 5 | 47 | 5.75