Abstract | ||
---|---|---|
In grid computing, good fault tolerance depends on the proper utilization of resources and the scheduling of jobs. The literature offers two main solutions to these challenges, namely checkpointing and replication. This paper proposes a checkpointing strategy that uses the median of the resources' failure intervals to decide the checkpoint intervals for the given jobs. The strategy improves system throughput and reduces job losses and job execution times, while eliminating unnecessary checkpoints. |
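
The abstract does not give the exact rule that maps failure intervals to checkpoint intervals, so the following is only a minimal sketch under stated assumptions: the checkpoint interval is taken to be the median gap between consecutive failure timestamps of a resource, and checkpoints are placed at multiples of that interval within a job. The function names `checkpoint_interval` and `checkpoint_times` and the sample timestamps are hypothetical and are not taken from the paper.

```python
from statistics import median


def checkpoint_interval(failure_times):
    """Return the median gap between consecutive failure timestamps.

    Assumption: the median failure interval is used directly as the
    checkpoint interval; the paper may apply a different rule.
    """
    times = sorted(failure_times)
    if len(times) < 2:
        raise ValueError("need at least two failure timestamps")
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return median(gaps)


def checkpoint_times(job_length, interval):
    """Return the offsets from job start at which checkpoints are taken."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    offsets = []
    t = interval
    while t < job_length:
        offsets.append(t)
        t += interval
    return offsets


# Hypothetical failure log for one resource (time units are arbitrary).
failures = [0, 10, 25, 30, 70]
interval = checkpoint_interval(failures)      # gaps are 10, 15, 5, 40
long_job = checkpoint_times(40, interval)     # checkpoints for a long job
short_job = checkpoint_times(10, interval)    # job shorter than the interval
```

Under these assumptions, a job shorter than the median failure interval receives no checkpoints at all, which is one plausible reading of how such a strategy could avoid unnecessary checkpoints. The median is also less sensitive than the mean to a single unusually long failure-free period, such as the gap of 40 in the sample log.
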
Year | DOI | Venue |
---|---|---|
2012 | 10.7148/2012-0483-0489 | Proceedings 26th European Conference on Modelling and Simulation ECMS 2012 |
Keywords | Field | DocType |
Fault tolerance, Checkpointing, Distributed systems | Job losses, Grid computing, Computer science, Scheduling (computing), Fault tolerance, Throughput, Reliability engineering | Conference |
Citations | PageRank | References |
2 | 0.35 | 12 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Suleman Khan | 1 | 55 | 4.02 |
Khizar Hayat | 2 | 248 | 19.71 |
Sajjad Ahmad Madani | 3 | 409 | 26.21 |
Samee Ullah Khan | 4 | 1605 | 81.01 |
Joanna Kolodziej | 5 | 920 | 55.57 |