Title
Evaluating SPLASH-2 Applications Using MapReduce
Abstract
MapReduce has become prevalent for running data-parallel applications. By hiding non-functional concerns such as parallelism, fault tolerance, and load balancing from programmers, MapReduce significantly simplifies programming for large clusters. Because of these features, researchers have also explored the use of MapReduce in other application domains, such as machine learning, text retrieval, and statistical translation. In this paper, we study the feasibility of running typical supercomputing applications using the MapReduce framework. We port two applications (Water Spatial and Radix Sort) from the Stanford SPLASH-2 suite to MapReduce. By thoroughly evaluating them on Hadoop, an open-source MapReduce framework for clusters, we analyze their major performance bottlenecks in the MapReduce framework. Based on this analysis, we also provide several suggestions for enhancing the MapReduce framework to suit these applications.
Year
2009
DOI
10.1007/978-3-642-03644-6_35
Venue
APPT
Keywords
application domain, data-parallel application, large cluster, water spatial, radix sort, stanford splash-2 suite, load balance, open-source mapreduce framework, mapreduce framework, splash-2 applications, fault tolerance, fault tolerant, machine learning
Field
Bottleneck, Suite, Supercomputer, Load balancing (computing), Computer science, Parallel computing, Radix sort, Fault tolerance, Execution time, Distributed computing
DocType
Conference
Volume
5737
ISSN
0302-9743
Citations
3
PageRank
0.37
References
13
Authors
6
Name            Order   Citations   PageRank
Shengkai Zhu    1       3           0.37
Zhiwei Xiao     2       8           1.47
Haibo Chen      3       1749        123.40
Rong Chen       4       586         30.22
Weihua Zhang    5       174         30.34
Binyu Zang      6       984         62.75