Towards a Collective Layer in the Big Data Stack - Citegraph

Paper Info

Title
Towards a Collective Layer in the Big Data Stack

Abstract
We generalize MapReduce, Iterative MapReduce and data-intensive MPI runtimes as a layered Map-Collective architecture, with the Map-All Gather, Map-All Reduce, MapReduce Merge Broadcast and Map-Reduce Scatter patterns as the initial focus. Map-collectives improve the performance and efficiency of computations while also making them easier to use. These collective primitives can be applied to multiple runtimes, and we propose building high-performance, robust implementations that span cluster and cloud systems. Here we present results for two collectives shared between Hadoop on clusters (where we term our extension H-Collectives) and the Twister4Azure Iterative MapReduce for the Azure Cloud. Our prototype implementations of the Map-All Gather and Map-All Reduce primitives achieved up to 33% performance improvement for K-means Clustering and up to 50% improvement for Multi-Dimensional Scaling, while also improving user friendliness. In some cases, use of Map-collectives eliminated almost all of the overheads of the computations.

Year: 2014
DOI: 10.1109/CCGrid.2014.123
Venue: Cluster, Cloud and Grid Computing
Keywords: Big Data, application program interfaces, cloud computing, distributed programming, message passing, pattern clustering, public domain software, Big Data stack, Hadoop, Map-All Gather primitives, Map-All Reduce primitives, MapReduce merge broadcast, MapReduce scatter patterns, Twister4Azure iterative MapReduce, cloud systems, collective layer, collective primitives, cross cluster, data intensive MPI runtime, k-means clustering, layered map-collective architecture, multidimensional scaling, user friendliness, Cloud, Collectives, HPC, K-means, MDS, MapReduce, Performance, Twister
DocType: Conference
ISSN: 2376-4414
Citations: 4
PageRank: 0.39
References: 11
Authors: 3

Authors (3 rows)

Name	Order	Citations	PageRank
Thilina Gunarathne	1	744	38.87
Judy Qiu	2	743	43.25
Dennis Gannon	3	2514	330.26
