Title
Transparent Fault Tolerance Solution at Socket Level Based on RADIC
Abstract
We present a transparent fault tolerance middleware based on RADIC (Redundant Array of Distributed Independent Controllers), a transparent and scalable fault-tolerant architecture for parallel applications. The middleware operates at the socket level and creates a secure tunnel connection that keeps the TCP sessions established by the application alive in spite of node failures. It runs at user level and is independent of the message-passing communication library in use. Protection is provided through uncoordinated checkpoints and message logging, and recovery is automatic, so node failures require no administrator intervention. We tested our fault tolerance system by executing master-worker (M/W) and SPMD applications that follow different communication patterns.
Year
2012
DOI
10.1109/ISPA.2012.121
Venue
ISPA
Keywords
transparent middleware, node failure, message-passing communication library, fault tolerance system, different communication pattern, socket level, transparent fault tolerance solution, scalable fault tolerant architecture, independent controllers, user level, fault tolerance, high availability, middleware, mpi, message passing, parallel computing, distributed processing, computer architecture
Field
Middleware, SPMD, Fault tolerant architecture, Computer science, Software fault tolerance, Real-time computing, Fault tolerance, High availability, Message passing, Distributed computing, Scalability
DocType
Conference
Citations
0
PageRank
0.34
References
2
Authors
3
Name              Order   Citations   PageRank
Marcela Castro    1       1           1.70
Dolores Rexachs   2       195         43.20
Emilio Luque      3       1097        176.18