Title
Collaborative accelerators for in-memory MapReduce on scale-up machines.
Abstract
Relying on efficient data analytics platforms is increasingly becoming crucial for both small and large scale datasets. While MapReduce implementations, such as Hadoop and Spark, were originally proposed for petascale processing in scale-out clusters, it has been noted that, today, most data center processes operate on gigabyte-order or smaller datasets, which are best processed in single high-end scale-up machines. In this context, Phoenix++ is a highly optimized MapReduce framework available for chip-multiprocessor (CMP) scale-up machines. In this paper we observe that Phoenix++ suffers from an inefficient utilization of the memory subsystem, and a serialized execution of the MapReduce stages. To overcome these inefficiencies, we propose CASM, an architecture that equips each core in a CMP design with a dedicated instance of a specialized hardware unit (the CASM accelerators). These units collaborate to manage the key-value data structure and minimize both on- and off-chip communication costs. Our experimental evaluation on a 64-core design indicates that CASM provides more than a 4x speedup over the highly optimized Phoenix++ framework, while keeping area overhead at only 6%, and reducing energy demands by over 3.5x.
Year
2019
DOI
10.1145/3287624.3287636
Venue
ASP-DAC
Field
Data structure, Architecture, Content-addressable memory, Spark (mathematics), Data analysis, Computer science, Implementation, Real-time computing, Petascale computing, Distributed computing, Speedup
DocType
Conference
Citations
1
PageRank
0.37
References
19
Authors
2
Name, Order, Citations, PageRank
Abraham Addisie, 1, 5, 1.79
Valeria Bertacco, 2, 1365, 86.93