Title
HPC-Reuse: Efficient Process Creation for Running MPI and Hadoop MapReduce on Supercomputers
Abstract
Hadoop and Spark analytics are widely used for large-scale data processing on commodity clusters. In terms of productivity and maturity, running them on supercomputers is a better choice than developing new frameworks from scratch. YARN, a key component of Hadoop, is responsible for resource management and manages job execution and scheduling dynamically. We identify three dynamic characteristics of YARN-like management, the three Ds (3D): on-Demand (processes are created during job execution), Diverse job, and Detailed (fine-grained allocation). This dynamic management does not fit typical resource managers on supercomputers, such as PBS, which have three static characteristics, the three Ss (3S): Stationary (no processes are created during execution), Single job, and Shallow (coarse-grained allocation). In this paper, we propose HPC-Reuse, which sits between YARN-like and PBS-like resource managers to provide better support for dynamic management. HPC-Reuse helps avoid process creation, such as MPI-Spawn, and enables MPI communication over Hadoop processes. Our experimental results show that HPC-Reuse can reduce the execution time of iterative PageRank by 26%.
Year
2016
DOI
10.1109/CCGrid.2016.72
Venue
2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)
Keywords
iterative PageRank, Hadoop processes, MPI communication, PBS-like resource managers, static characteristics, diverse job, YARN-like management, dynamic characteristics, scheduling, job execution, dynamic management, commodity clusters, large-scale data processing, Spark analytics, supercomputers, Hadoop MapReduce, process creation, HPC-reuse
Field
Resource management, PageRank, Data processing, Spark (mathematics), Yarn, Reuse, Computer science, Scheduling (computing), Real-time computing, Analytics, Operating system, Distributed computing
DocType
Conference
ISSN
2376-4414
ISBN
978-1-5090-2454-4
Citations
2
PageRank
0.38
References
8
Authors
2
Name | Order | Citations | PageRank
Thanh-Chung Dao | 1 | 3 | 0.78
Shigeru Chiba | 2 | 1281 | 140.78