Title
MR-Advisor: A Comprehensive Tuning Tool for Advising HPC Users to Accelerate MapReduce Applications on Supercomputers
Abstract
MapReduce is the most popular parallel computing framework for big data processing, allowing massive scalability across distributed computing environments. An advanced RDMA-based design of Hadoop MapReduce has been proposed that alleviates the performance bottlenecks in default Hadoop MapReduce by leveraging the benefits of RDMA. On the other hand, the data processing engine Spark provides fast execution of MapReduce applications through in-memory processing. Performance optimization for these contemporary big data processing frameworks on modern High-Performance Computing (HPC) systems is a formidable task because of the numerous configuration possibilities in each of them. In this paper, we propose MR-Advisor, a comprehensive tuning tool for MapReduce. MR-Advisor is generalized to provide performance optimizations for Hadoop, Spark, and RDMA-enhanced Hadoop MapReduce designs over different file systems such as HDFS, Lustre, and Tachyon. Performance evaluations reveal that, with MR-Advisor's suggested values, job execution performance can be improved by up to 58% over the current best-practice values for user-level configuration parameters. To the best of our knowledge, this is the first tool that supports tuning for both Apache Hadoop and Spark, as well as the RDMA and Lustre-based advanced designs.
Year: 2016
DOI: 10.1109/SBAC-PAD.2016.33
Venue: 2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)
Keywords: MR-advisor, comprehensive tuning tool, HPC users, supercomputers, parallel computing framework, Big Data processing, Hadoop MapReduce, Spark, high-performance computing systems, performance optimizations
Field: Big data processing, Data processing, Spark (mathematics), Data-intensive computing, Distributed Computing Environment, Computer science, Parallel computing, Real-time computing, Remote direct memory access, Lustre (mineralogy), Operating system, Scalability
DocType: Conference
ISSN: 1550-6533
ISBN: 978-1-5090-6109-9
Citations: 0
PageRank: 0.34
References: 7
Authors: 5
Name                   Order  Citations  PageRank
Md. Wasi-ur-Rahman     1      412        26.84
Nusrat S. Islam        2      229        14.08
Xiaoyi Lu              3      602        60.53
Dipti Shankar          4      120        10.71
Dhabaleswar K. Panda   5      5366       446.70