Title
Design and Implementation of Effective Checkpointing for Multithreaded Applications on Future Clouds
Abstract
Multithreaded applications are common in high-performance cloud computing systems, where they can take advantage of the elastic resource availability and cost fluctuation inherent to such systems. When an application runs many threads on additional cores leased from a RaaS (Resource-as-a-Service) cloud under spot instance pricing for faster execution, resource unavailability is more likely to occur, undercutting the execution performance gains those additional cores could offer. Checkpointing is therefore required to lower the adverse impact of resource unavailability on the execution performance of such multithreaded applications. Because checkpointing often incurs expensive I/O to remote storage, this work presents the design and implementation of our adaptive incremental checkpointing (AIC) for multithreaded applications on RaaS clouds. AIC uses idle cores for adaptive delta compression and remote checkpointing, significantly reducing the expected job turnaround time and the aggregated file size at remote storage. To ensure high compatibility and portability for AIC, we exploit techniques that avoid kernel-specific data structures. AIC has been evaluated using PARSEC benchmarks on our testbed, which resembles a multicore system acquired from a RaaS cloud. The results show that AIC reduces the expected turnaround time by up to 37% and the aggregated file size by up to 8.3× compared with a recent multi-level checkpointing scheme with fixed checkpoint intervals.
Year: 2013
DOI: 10.1109/CLOUD.2013.57
Venue: IEEE CLOUD
Keywords: checkpointing, cloud computing, multi-threading, multiprocessing systems, resource allocation, AIC, PARSEC benchmarks, RaaS cloud, adaptive delta compression, adaptive incremental checkpointing, checkpoint intervals, cost fluctuation, elastic resource availability, execution performance gains, high performance cloud computing systems, multicore system, multilevel checkpointing scheme, multithreaded applications, remote checkpointing, resource-as-a-service cloud, spot instance pricing, Adaptive checkpointing, RaaS clouds, delta compression, fault tolerance, incremental checkpointing, networked multicore systems
Field: Multithreading, Computer science, Real-time computing, Unavailability, Resource allocation, Software portability, Turnaround time, Multi-core processor, Delta encoding, Cloud computing, Distributed computing
DocType: Conference
ISSN: 2159-6182
Citations: 2
PageRank: 0.36
References: 14
Authors: 2
Name                  Order  Citations  PageRank
Itthichok Jangjaimon  1      21         1.71
Nian-Feng Tzeng       2      856        94.11