Abstract | ||
---|---|---|
With the growing complexity of parallel architectures, the probability of system failures grows as well. One approach to this problem is self-healing, one of the self-x features of organic computing. Self-healing in this context means that computer clusters detect and handle failures automatically. This paper presents a self-healing mechanism based on checkpointing that keeps a cluster operational even if some sites or the connections between them fail. The proposed method has been implemented and tested on the Self Distributing Virtual Machine (SDVM). |
Year | Venue | Keywords |
---|---|---|
2004 | GI-Jahrestagung | parallel systems |
Field | DocType | Citations |
---|---|---|
Crash, Computer science, Parallel computing, Bulk synchronous parallel, Distributed computing | Conference | 1 |
PageRank | References | Authors |
---|---|---|
0.39 | 4 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jan Haase | 1 | 16 | 6.08 |
Frank Eschmann | 2 | 20 | 3.56 |