Title
Towards Easy-to-Use Checkpointing of MPI Applications within CLUSTERIX
Abstract
Many kernel-level and user-level libraries and systems support checkpointing running processes and resuming their execution, yet it remains difficult to provide an easy-to-use tool for checkpointing parallel applications. In this work, we aim to develop an easy-to-use, user-guided library for checkpointing parallel MPI applications executed within the CLUSTERIX environment, i.e., a collection of distributed HPC clusters. We propose a programmer-assisted approach in which process state is packed and unpacked at the code level for SPMD HPC applications. Although the library is at an early stage of development, we present checkpoint/restart times and execution times of an application interrupted by checkpointing, comparing the proposed approach with the same application linked with the ckpt user-level library.
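To make the programmer-assisted idea concrete, the following is a minimal sketch of code-level state packing and unpacking for a single rank of an SPMD-style loop. It does not reproduce the library's API, which the abstract does not show: the function names (pack_state, unpack_state, save_checkpoint, load_checkpoint, run), the file naming scheme, and the binary layout are all hypothetical, and plain Python without MPI is used so the example stays self-contained.

```python
import os
import struct

# Hypothetical on-disk layout (an assumption, not the paper's format):
# iteration (int64), element count (int64), then that many float64 values.
_HEADER = struct.Struct("<qq")


def pack_state(iteration, values):
    """Serialize only the state the programmer chose to preserve."""
    body = struct.pack(f"<{len(values)}d", *values)
    return _HEADER.pack(iteration, len(values)) + body


def unpack_state(blob):
    """Inverse of pack_state: rebuild (iteration, values) from bytes."""
    iteration, count = _HEADER.unpack_from(blob, 0)
    values = list(struct.unpack_from(f"<{count}d", blob, _HEADER.size))
    return iteration, values


def checkpoint_path(directory, rank):
    # One file per rank; the naming scheme is illustrative only.
    return os.path.join(directory, f"ckpt_rank{rank}.bin")


def save_checkpoint(directory, rank, iteration, values):
    """Write to a temporary file, then rename it over the final name,
    so an interruption never leaves a partially written checkpoint."""
    final = checkpoint_path(directory, rank)
    tmp = final + ".tmp"
    with open(tmp, "wb") as f:
        f.write(pack_state(iteration, values))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, final)


def load_checkpoint(directory, rank):
    """Return (iteration, values), or None if no checkpoint exists."""
    path = checkpoint_path(directory, rank)
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return unpack_state(f.read())


def run(directory, rank, total_iters, stop_after=None, every=2):
    """Toy computation for one rank; resumes from a checkpoint if present.

    stop_after simulates an interruption at a given iteration.
    """
    restored = load_checkpoint(directory, rank)
    iteration, values = restored if restored else (0, [0.0, 0.0])
    while iteration < total_iters:
        if stop_after is not None and iteration >= stop_after:
            return iteration, values  # simulated interruption
        values = [v + rank + 1 for v in values]
        iteration += 1
        if iteration % every == 0:
            save_checkpoint(directory, rank, iteration, values)
    return iteration, values
```

In this sketch, a run that is interrupted and then restarted in the same directory picks up from the last saved iteration and reaches the same final state as an uninterrupted run. In an actual MPI program, each rank would presumably pack its own state in the same way, with the ranks agreeing on a consistent point at which to checkpoint; that coordination is outside the scope of this example.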
Year
2004
DOI
10.1109/PARELEC.2004.72
Venue
PARELEC
Keywords
mpi applications, parallel application, parallel mpi application, spmd hpc application, ckpt user level library, towards easy-to-use checkpointing, hpc cluster, application execution, easy-to-use user-guided library, user level library, easy-to-use tool, code level, parallel programming, message passing
Field
Kernel (linear algebra), SPMD, Process state, Computer science, Parallel computing, Message passing, Unpacking, Operating system
DocType
Conference
ISBN
0-7695-2080-4
Citations
1
PageRank
0.37
References
5
Authors
5
Name                  Order  Citations  PageRank
Pawel Czarnul         1      121        21.11
Arkadiusz Urbaniak    2      4          0.96
Marcin Fraczak        3      1          0.37
Maciej Dyczkowski     4      1          0.37
Bartlomiej Balcerek   5      1          0.37