Abstract |
---|
Checkpointing techniques have been widely studied in the literature as a way to recover from failures in sequential, distributed, and parallel environments. However, most of the checkpointing mechanisms proposed so far recover only the application data. If the application performs I/O operations on disk files, such schemes may not work correctly, because they provide no rollback-recovery for the file contents. In this paper, we present a distributed checkpointing mechanism for a parallel file system that can be integrated with any of the previously proposed application checkpointing algorithms. Three file checkpointing schemes are presented, tested within that mechanism, and discussed in detail. The proposed distributed mechanism was integrated into PIOUS, a public-domain parallel file system developed for the PVM distributed computing environment. |
Year | Venue | Keywords |
---|---|---|
2000 | PVM/MPI | checkpointing mechanism,previous application,I/O operation,checkpointing technique,parallel environment,parallel file system,different file,public-domain parallel file system,disk file,application data,file content,distributed computing environment,public domain,fault tolerant |
Field | DocType | Volume |
---|---|---|
File system,Virtual machine,Distributed Computing Environment,Self-certifying File System,Computer science,Parallel computing,Application checkpointing,Fault tolerance,Parallel I/O,Distributed computing | Conference | 1908 |
ISSN | ISBN | Citations |
---|---|---|
0302-9743 | 3-540-41010-4 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 9 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Vítor N. Távora | 1 | 0 | 0.34 |
Luís Moura Silva | 2 | 312 | 36.22 |
João Gabriel Silva | 3 | 618 | 63.55 |