Title
Deduplication Potential of HPC Applications’ Checkpoints
Abstract
HPC systems contain an increasing number of components, decreasing the mean time between failures. Checkpoint mechanisms help to overcome such failures for long-running applications. A viable solution to remove the resulting pressure from the I/O backends is to deduplicate the checkpoints. However, there is little knowledge about the potential to save I/Os for HPC applications by using deduplication within the checkpointing process. In this paper, we perform a broad study about the deduplication behavior of HPC application checkpointing and its impact on system design.
Year: 2016
DOI: 10.1109/CLUSTER.2016.32
Venue: 2016 IEEE International Conference on Cluster Computing (CLUSTER)
Keywords: deduplication, checkpointing
Field: Data deduplication, Mean time between failures, Computer science, Parallel computing, Systems design, Application checkpointing, Real-time computing, Redundancy (engineering), Operating system, Scalability, Distributed computing
DocType: Conference
ISSN: 1552-5244
ISBN: 978-1-5090-3654-7
Citations: 1
PageRank: 0.35
References: 22
Authors: 6
Name              Order  Citations  PageRank
Jürgen Kaiser     1      42         3.99
Ramy Gad          2      10         2.35
Tim Süß           3      7          0.93
Federico Padua    4      10         1.29
Lars Nagel        5      76         13.58
André Brinkmann   6      403        34.79