Title
Slurm-V: Extending Slurm for Building Efficient HPC Cloud with SR-IOV and IVShmem.
Abstract
To alleviate the cost burden, efficiently sharing HPC cluster resources with end users through virtualization is becoming increasingly attractive. In this context, some critical HPC resources among Virtual Machines (VMs), such as Single Root I/O Virtualization (SR-IOV) enabled Virtual Functions (VFs) and Inter-VM Shared Memory (IVShmem) devices, need to be enabled and isolated so that multiple concurrent MPI jobs can run efficiently on HPC clouds. However, the original Slurm is not able to supervise VMs and their associated critical resources, such as VFs and IVShmem devices. This paper proposes a novel framework, Slurm-V, which extends Slurm with virtualization-oriented capabilities such as job submission to dynamically created VMs with isolated SR-IOV and IVShmem resources. We propose several alternative designs for Slurm-V to manage and isolate IVShmem and SR-IOV resources for running MPI jobs: a Task-based design, a SPANK plugin-based design, and a SPANK plugin over OpenStack-based design. We evaluate these designs in terms of startup performance, scalability, and application performance in different scenarios. The evaluation results show that VM startup time can be reduced by up to 2.64X through the snapshot scheme in the Slurm SPANK plugin. Our proposed Slurm-V framework shows good scalability and the ability to efficiently run concurrent MPI jobs on SR-IOV enabled InfiniBand clusters. To the best of our knowledge, Slurm-V is the first attempt to extend Slurm to support running concurrent MPI jobs with isolated SR-IOV and IVShmem resources. The capabilities of Slurm-V can be used to build efficient HPC clouds.
Year: 2016
DOI: 10.1007/978-3-319-43659-3_26
Venue: Euro-Par
Field: Virtualization, Virtual machine, InfiniBand, Shared memory, Computer science, Parallel computing, Virtual function, Plug-in, Operating system, Cloud computing, Scalability, Distributed computing
DocType: Conference
Volume: 9833
ISSN: 0302-9743
Citations: 3
PageRank: 0.43
References: 9
Authors: 4
Name | Order | Citations | PageRank
Jie Zhang | 1 | 92 | 7.52
Xiaoyi Lu | 2 | 602 | 60.53
Sourav Chakraborty | 3 | 381 | 49.27
Dhabaleswar K. Panda | 4 | 5366 | 446.70