Title
Progressive Retry for Software Failure Recovery in Message-Passing Applications
Abstract
A method of execution retry for bypassing software faults in message-passing applications is described in this paper. Based on the techniques of checkpointing and message logging, we demonstrate the use of message replaying and message reordering as two mechanisms for achieving localized and fast recovery. The approach gradually increases the rollback distance and the number of affected processes when a previous retry fails, and is therefore named progressive retry. Examples from telecommunications software systems and performance measurements from an application-level implementation are described to illustrate the benefits of the scheme.
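The escalation described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation. The names `run`, `progressive_retry`, and `SoftwareFault`, the toy application, and its order-dependent bug are all hypothetical. The sketch assumes that every message since the last checkpoint is logged and that any reordering of the rolled-back messages is legal; a real message-passing system would have to respect dependencies between processes. It models the growing rollback distance only, not the growing number of affected processes.

```python
from itertools import permutations
from typing import Optional, Sequence, Tuple


class SoftwareFault(RuntimeError):
    """Raised by the toy application when its (hypothetical) bug is triggered."""


def run(checkpoint: int, messages: Sequence[str]) -> int:
    """Toy application: add message lengths to a checkpointed integer state.

    Hypothetical bug: processing "b" immediately after "a" fails.
    """
    state = checkpoint
    previous = None
    for msg in messages:
        if previous == "a" and msg == "b":
            raise SoftwareFault(f"bad interleaving: {previous!r} then {msg!r}")
        state += len(msg)
        previous = msg
    return state


def progressive_retry(
    checkpoint: int, log: Sequence[str]
) -> Optional[Tuple[int, str, Tuple[str, ...], int]]:
    """Retry from the checkpoint, widening the rollback only after a failure.

    Returns (rollback distance, mechanism, message order used, final state),
    or None if no attempted ordering bypasses the fault.
    """
    for distance in range(1, len(log) + 1):
        # Messages before the rollback point keep their logged order.
        prefix = tuple(log[:-distance])
        suffix = tuple(log[-distance:])
        # permutations() yields the original order first, so attempt 0 is a
        # plain replay and every later attempt is a reordering.
        for attempt, ordering in enumerate(permutations(suffix)):
            mechanism = "replay" if attempt == 0 else "reorder"
            try:
                state = run(checkpoint, prefix + ordering)
            except SoftwareFault:
                continue  # this retry failed; try the next ordering
            return distance, mechanism, prefix + ordering, state
    return None  # every retry failed; a larger-scope recovery would be needed
```

Calling `progressive_retry(0, ["x", "a", "b"])` first retries with a rollback distance of one message, where only replay is possible and the fault recurs. It then widens the rollback to two messages and succeeds by delivering "b" before "a". Trying the cheapest, most local retry first is the design choice the abstract motivates: recovery stays fast and localized unless a previous retry fails.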
Year
1997
DOI
10.1109/12.628398
Venue
IEEE Trans. Computers
Keywords
affected process, message-passing application, fast recovery, message-passing applications, progressive retry, performance measurement, rollback distance, bypassing software fault, message reordering, software failure recovery, application-level implementation, message replaying, telecommunications software system, software fault tolerance, distributed systems, logging, protocols, message passing, measurement, availability, computational complexity, recovery, software systems, application software, distributed computing, telecommunication systems, telecommunications, sorting, fault tolerance
Field
Computer science, Computer network, Telecommunications control software, Real-time computing, Software system, Application software, Message passing, Parallel computing, Software fault tolerance, Fault tolerance, Software quality, Rollback, Embedded system
DocType
Journal
Volume
46
Issue
10
ISSN
0018-9340
Citations
15
PageRank
1.15
References
16
Authors
4
Name | Order | Citations | PageRank
Yi-min Wang | 1 | 3573 | 274.00
Yennun Huang | 2 | 738 | 106.38
W. Kent Fuchs | 3 | 15 | 1.15
Chandra Kintala | 4 | 196 | 29.54