Title
Optimization of checkpointing/recovery strategy in cloud computing with adaptive storage management.
Abstract
Cloud computing is a type of distributed system in which services are typically offered to users under a Service Level Agreement (SLA). In this setting, implementing a fault-tolerant system that ensures reliability and service continuity becomes a major requirement. In this paper, we propose a fault tolerance strategy based on checkpointing and replication. Our approach uses a smart checkpoint infrastructure for cloud computing tasks. The checkpoints are stored in alternative, already-paid VMs, which allows a task to resume faster and more cheaply after a node crash. Because checkpoints are distributed and replicated, our approach also increases system reliability. The experimental results show the effectiveness of the proposed strategy in terms of energy consumption, SLA violations, and reliability.
Year: 2018
DOI: 10.1002/cpe.4906
Venue: CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE
Keywords: availability, checkpointing, replication, SLA, cloud computing, reliability, slots
Field: Computer science, Storage management, Cloud computing, Distributed computing
DocType: Journal
Volume: 30
Issue: SP24
ISSN: 1532-0626
Citations: 2
PageRank: 0.37
References: 14
Authors: 2
Name              Order   Citations   PageRank
Bakhta Meroufel   1       12          2.60
Ghalem Belalem    2       106         30.12