Cooperative checkpointing theory - Citegraph

Paper Info

Title
Cooperative checkpointing theory

Abstract
Cooperative checkpointing uses global knowledge of the state and health of the machine to improve performance and reliability by dynamically deciding when to skip checkpoint requests made by applications. Using results from cooperative checkpointing theory, this paper proves that periodic checkpointing is not expected to be competitive with the offline optimal. By leveraging probabilistic information about the future, cooperative checkpointing gives flexible algorithms that are optimally competitive. The results prove that simulating periodic checkpointing, by performing only every dth checkpoint, is not competitive with the offline optimal in the worst case; a simple modification gives a provably competitive algorithm. Calculations using failure traces from a prototype of IBM's Blue Gene/L show an application using cooperative checkpointing may make progress 4 times faster than one using periodic checkpointing, under realistic conditions. We contribute an approach to providing large-scale system reliability through cooperative checkpointing and techniques for analyzing the approach.
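The abstract describes simulating periodic checkpointing by performing only every dth checkpoint request and skipping the rest. A minimal sketch of that skip rule follows. The function names `every_dth_policy` and `run_requests`, and the 1-based request indexing, are illustrative assumptions and do not come from the paper. The "simple modification" that makes the algorithm provably competitive is not specified in the abstract, so it is not shown here.

```python
from typing import Callable, List


def every_dth_policy(d: int) -> Callable[[int], bool]:
    """Return a policy that performs only every d-th checkpoint request.

    Hypothetical helper: request indices are assumed to be 1-based, so
    requests d, 2d, 3d, ... are performed and all others are skipped.
    """
    if d < 1:
        raise ValueError("d must be a positive integer")

    def should_perform(request_index: int) -> bool:
        # Perform the checkpoint only when the index is a multiple of d.
        return request_index % d == 0

    return should_perform


def run_requests(num_requests: int, policy: Callable[[int], bool]) -> List[int]:
    """Return the 1-based indices of the requests that the policy performs."""
    return [i for i in range(1, num_requests + 1) if policy(i)]


if __name__ == "__main__":
    # With d = 3, only requests 3, 6 and 9 out of 10 are performed.
    print(run_requests(10, every_dth_policy(3)))
```

A cooperative scheme, as described in the abstract, would replace the fixed modulus test with a decision based on the state and health of the machine; this sketch covers only the fixed every-dth baseline.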

Year	2006
DOI	10.1109/IPDPS.2006.1639368
Venue	International Parallel and Distributed Processing Symposium/International Parallel Processing Symposium
Keywords	blue gene,cooperative checkpointing theory,large-scale system reliability,failure trace,periodic checkpointing,dth checkpoint,checkpoint request,offline optimal,cooperative checkpointing,provably competitive algorithm,prototypes,interference,computer science,cost function
Field	IBM,Computer science,Parallel computing,Blue gene,Competitive algorithm,Probabilistic logic,Periodic graph (geometry),Distributed computing
DocType	Conference
ISBN	1-4244-0054-6
Citations	12
PageRank	1.26
References	14
Authors	3

Authors

Name	Order	Citations	PageRank
Adam J. Oliner	1	715	51.10
Larry Rudolph	2	168	15.54
Ramendra K. Sahoo	3	633	56.73
