Title
GiFT: Automating FTPA Implementation for MPI Programs
Abstract
Fault tolerance is a critical issue in large-scale computing. The fault-tolerant parallel algorithm (FTPA) is an application-level technique for tolerating hardware failures. FTPA achieves fast failure recovery through parallel recomputing, but it complicates the coding of the application program. This paper uses compiler technology to automate the design of FTPA and introduces the implementation of a tool called GiFT (Get it Fault-Tolerant). GiFT uses an extended data-flow analysis to select the state needed for failure recovery, applies a parallel recomputing time model to compute the optimal number of recomputing processes, and uses parallelization techniques to generate the parallel recomputing code. Experimental results show that GiFT correctly transforms original MPI programs into their FTPA counterparts, and that the performance of GiFT-generated FTPA programs is comparable to that of hand-modified FTPA programs.
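The abstract does not state the parallel recomputing time model itself. As a rough illustration of how such a model could be used to pick the number of recomputing processes, the sketch below assumes a purely hypothetical cost function: the lost work is divided evenly among p processes, and each additional process adds a fixed coordination overhead. The function names, the parameters `work`, `overhead`, and `max_procs`, and the cost formula are all assumptions made for illustration, not GiFT's actual model.

```python
def recompute_time(work, overhead, p):
    """Hypothetical cost model (an assumption, not the paper's):
    lost work split evenly over p processes, plus a fixed
    coordination overhead for each process beyond the first."""
    return work / p + overhead * (p - 1)


def optimal_recompute_processes(work, overhead, max_procs):
    """Return the process count in 1..max_procs with the lowest
    modeled recomputing time (smallest count wins on ties)."""
    candidates = range(1, max_procs + 1)
    return min(candidates, key=lambda p: recompute_time(work, overhead, p))
```

Under this assumed model, using more processes only helps while the saved computation outweighs the added overhead, so the search settles on an interior value rather than always choosing `max_procs`.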
Year
2008
DOI
10.1109/ICPADS.2008.89
Venue
ICPADS
Keywords
parallel recomputing,hand-modified ftpa program,parallel recomputing time model,mpi programs,failure recovery,fault-tolerant parallel algorithm,recomputing process,gift-generated ftpa program,automating ftpa implementation,fast failure recovery,ftpa counterpart,parallel recomputing code,fault tolerant,computational modeling,data flow analysis,parallel algorithm,message passing,parallel algorithms,algorithms,fault tolerance
Field
Computer science,Parallel algorithm,Parallel computing,Data-flow analysis,Real-time computing,Coding (social sciences),Compiler,Exploit,Fault tolerance,Time model,Message passing,Distributed computing
DocType
Conference
Citations
0
PageRank
0.34
References
10
Authors
5
Name           Order  Citations  PageRank
Hongyi Fu      1      68         12.50
Yunfei Du      2      72         14.62
Panfeng Wang   3      34         6.12
Jia Jia        4      36         4.01
Xuejun Yang    5      678        73.26