Title
CellMR: A framework for supporting MapReduce on asymmetric Cell-based clusters
Abstract
Asymmetric multi-core processors with on-chip computational accelerators are becoming common in environments ranging from scientific computing to enterprise applications. Current research has focused on making efficient use of individual systems and on porting applications to asymmetric processors. In this paper, we take the next step by investigating the use of multi-core-based systems, especially the popular Cell processor, in a cluster setting. We present CellMR, an efficient and scalable implementation of the MapReduce framework for asymmetric Cell-based clusters. The novelty of CellMR lies in its streaming approach to supporting MapReduce and in its adaptive resource scheduling schemes: instead of allocating workloads to the components once, CellMR slices the input into small work units and streams them to the asymmetric nodes for efficient processing. Moreover, CellMR removes I/O bottlenecks by design, using techniques such as double-buffering and asynchronous I/O to maximize cluster performance. Our evaluation of CellMR using typical MapReduce applications shows that it achieves 50.5% better performance than the standard non-streaming approach, introduces very little overhead on the manager irrespective of application input size, scales almost linearly with an increasing number of compute nodes (an average speedup of 6.9 when using eight nodes compared to a single node), and effectively adapts the parameters of its resource management policy across applications with varying computation density.
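The streaming idea in the abstract can be illustrated with a minimal Python sketch. It is not CellMR's implementation: the function names (`slice_input`, `map_unit`, `stream_mapreduce`), the word-count workload, the thread-based "nodes", and the round-robin dispatch are all assumptions made for illustration, and the bounded per-worker queue is only a stand-in for double-buffering (one unit in flight while the next is staged).

```python
import queue
import threading
from collections import Counter


def slice_input(data, unit_size):
    """Split the input into small work units of at most unit_size items."""
    return [data[i:i + unit_size] for i in range(0, len(data), unit_size)]


def map_unit(unit):
    """Hypothetical map step: count the words in one work unit."""
    counts = Counter()
    for line in unit:
        counts.update(line.split())
    return counts


def worker(inbox, results):
    """Process work units as they arrive; None marks the end of the stream."""
    while True:
        unit = inbox.get()
        if unit is None:
            break
        results.put(map_unit(unit))


def stream_mapreduce(data, unit_size=2, num_workers=2):
    # maxsize=1: while a worker processes one unit, at most one more is
    # staged for it. This approximates double-buffering (an assumption,
    # not the paper's mechanism).
    inboxes = [queue.Queue(maxsize=1) for _ in range(num_workers)]
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(q, results))
               for q in inboxes]
    for t in threads:
        t.start()

    units = slice_input(data, unit_size)
    for i, unit in enumerate(units):
        # Round-robin dispatch; put() blocks while that worker's buffer is full.
        inboxes[i % num_workers].put(unit)
    for q in inboxes:
        q.put(None)
    for t in threads:
        t.join()

    # Reduce step: merge the partial counts, one per work unit.
    total = Counter()
    for _ in units:
        total += results.get()
    return total


if __name__ == "__main__":
    print(stream_mapreduce(["a b", "b c", "c c", "a"], unit_size=1))
```

The manager in this sketch never holds more than one pending unit per worker, so its memory footprint does not depend on input size. An adaptive scheduler, as the abstract describes, would additionally tune parameters such as the work-unit size per application, which this sketch leaves fixed.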
Year
2009
DOI
10.1109/IPDPS.2009.5161062
Venue
IPDPS
Keywords
efficient processing, asymmetric node, adaptive resource scheduling scheme, typical mapreduce application, mapreduce framework, application input size, asymmetric cell-based cluster, asymmetric multi-core processor, efficient use, better performance, scientific computing, distributed computing, parallel programming, computational modeling, chip, resource allocation, resource manager, high performance computing, computer architecture, scheduling, double buffering, asynchronous I/O, acceleration, multi-core processor, resource management, data processing, multicore processing, programming
Field
Asynchronous communication, Supercomputer, Computer science, Scheduling (computing), Parallel computing, Resource allocation, Asynchronous I/O, Multi-core processor, Scalability, Distributed computing, Speedup
DocType
Conference
ISSN
1530-2075
Citations
30
PageRank
1.72
References
17
Authors
4
Name                        Order  Citations  PageRank
M. Mustafa Rafique          1      157        15.49
Benjamin Rose               2      66         4.08
Ali R. Butt                 3      651        47.51
Dimitrios S. Nikolopoulos   4      1469       128.40