Abstract | ||
---|---|---|
We present a transparent middleware for fault tolerance based on RADIC (Redundant Array of Distributed Independent Controllers), a transparent and scalable fault-tolerant architecture for parallel applications. The middleware operates at the socket level and establishes a secure tunnel connection that keeps the TCP sessions opened by the application alive in spite of node failures. It runs at user level and is independent of the message-passing communication library in use. Protection is provided through uncoordinated checkpoints and message logging, and recovery is automatic, so node failures require no intervention from the administrator. We tested our fault tolerance system by running master-worker (M/W) and SPMD applications that follow different communication patterns. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/ISPA.2012.121 | ISPA |
Keywords | Field | DocType |
transparent middleware,node failure,message-passing communication library,fault tolerance system,different communication pattern,socket level,transparent fault tolerance solution,scalable fault tolerant architecture,independent controllers,user level,fault tolerance,high availability,middleware,mpi,message passing,parallel computing,distributed processing,computer architecture | Middleware,SPMD,Fault tolerant architecture,Computer science,Software fault tolerance,Real-time computing,Fault tolerance,High availability,Message passing,Distributed computing,Scalability | Conference |
Citations | PageRank | References |
0 | 0.34 | 2 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Marcela Castro | 1 | 1 | 1.70 |
Dolores Rexachs | 2 | 195 | 43.20 |
Emilio Luque | 3 | 1097 | 176.18 |