Abstract | ||
---|---|---|
Parallelism in file systems is obtained by using several independent server nodes, each supporting one or more secondary storage devices. This approach increases the performance and scalability of the system, but a fault in a single node can make the whole system fail. To avoid this problem, data must be stored with some form of redundancy so that it can be recovered after a failure. Fault tolerance can be provided in I/O systems by using replication or RAID-based schemes. However, most current systems apply the same fault tolerance technique at the disk or file system level. This paper describes how fault tolerance support can be used by MPI applications based on PVFS version 2 [1], a well-known parallel file system for clusters. This support can be applied to other parallel file systems with many benefits: fault tolerance at the file level, flexible definition of new fault tolerance schemes, and dynamic reconfiguration of the fault tolerance policy. |
Year | DOI | Venue |
---|---|---|
2007 | 10.1007/978-3-540-75416-9_25 | PVM/MPI |
Keywords | Field | DocType |
parallel file system,file system,well-known parallel file system,mpi-io parallel file system,new fault tolerance scheme,fault tolerant,fault tolerance policy,fault tolerant file model,I/O system,file level,fault tolerance support,fault tolerance,clusters | Stuck-at fault,File system,Segmentation fault,Self-certifying File System,Computer science,Software fault tolerance,Fault tolerance,File system fragmentation,Control reconfiguration,Operating system | Conference |
Volume | ISSN | ISBN |
4757 | 0302-9743 | 3-540-75415-6 |
Citations | PageRank | References |
0 | 0.34 | 7 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
A. Calderón | 1 | 8 | 1.54 |
F. García-Carballeira | 2 | 0 | 0.34 |
Florin Isaila | 3 | 234 | 24.01 |
Rainer Keller | 4 | 77 | 8.08 |
Alexander Schulz | 5 | 0 | 0.34 |