Title
Secure Live Migration Of Parallel Applications Using Container-Based Virtual Machines
Abstract
A parallel application will terminate when a computational node fails. As the number of components in supercomputers increases and applications scale to use these systems, the mean time to failure decreases. Traditional fault tolerance approaches, such as checkpointing, are failing to scale. An alternative approach we explore in this paper is the use of VM-based live migration to move a process from a failing node to a healthy one to reduce the fault rate experienced by an application. We investigate the use of a virtualisation environment based on OpenVZ to perform live migrations of virtual machines on which multi-processor parallel applications are running. We explore the correctness, performance, security, and reliability of this approach along with the overhead of using OS-level virtualised systems for fault recovery. Our results confirm that it is possible to efficiently migrate virtual containers without affecting the correctness or completion of parallel applications.
Year: 2012
DOI: 10.1504/IJSSC.2012.045562
Venue: INTERNATIONAL JOURNAL OF SPACE-BASED AND SITUATED COMPUTING
Keywords: reliability, fault tolerance, network architecture and design, high performance computing, network security
Field: Virtualization, Mean time between failures, Virtual machine, Supercomputer, Live migration, Computer science, Network security, Correctness, Fault tolerance, Embedded system, Distributed computing
DocType: Journal
Volume: 2
Issue: 1
ISSN: 2044-4893
Citations: 2
PageRank: 0.37
References: 1
Authors: 3
Name | Order | Citations | PageRank
Thomas J. Hacker | 1 | 338 | 32.29
Fabian Romero | 2 | 35 | 2.14
Jeremiah J. Nielsen | 3 | 2 | 0.37