Title
Driving big data with big compute
Abstract
Big Data (as embodied by Hadoop clusters) and Big Compute (as embodied by MPI clusters) provide unique capabilities for storing and processing large volumes of data. Hadoop clusters make distributed computing readily accessible to the Java community, and MPI clusters provide high parallel efficiency for compute-intensive workloads. Bringing the big data and big compute communities together is an active area of research. The LLGrid team has developed and deployed a number of technologies that aim to provide the best of both worlds. LLGrid MapReduce allows the map/reduce parallel programming model to be used quickly and efficiently in any language on any compute cluster. D4M (Dynamic Distributed Dimensional Data Model) provides a high-level distributed arrays interface to the Apache Accumulo database. The accessibility of these technologies is assessed by measuring the effort required to use these tools, which is typically a few lines of code. The performance is assessed by measuring the insert rate into the Accumulo database. Using these tools, a database insert rate of 4M inserts/second has been achieved on an 8-node cluster.
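The map/reduce programming model named in the abstract can be illustrated with a minimal single-process sketch. This is not LLGrid MapReduce or its interface; the function names (map_phase, shuffle, reduce_phase, word_count_mapper) and the sample input are hypothetical and serve only to show the map, group-by-key, and reduce steps that such a tool would distribute across a cluster.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key, as a cluster scheduler would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(values) for key, values in groups.items()}


def word_count_mapper(line):
    """Hypothetical mapper: emit (word, 1) for each whitespace-separated word."""
    return [(word.lower(), 1) for word in line.split()]


# Illustrative input only; a real job would read files from a shared file system.
lines = ["big data big compute", "big compute clusters"]
counts = reduce_phase(shuffle(map_phase(lines, word_count_mapper)), sum)
print(counts)
```

In a cluster setting the map calls would run as independent tasks over separate input files and the reduce step would merge their outputs; the sketch keeps everything in one process so the data flow stays visible.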
Year
2012
DOI
10.1109/HPEC.2012.6408678
Venue
HPEC
Keywords
node cluster, database management systems, java community, application program interfaces, big data, big compute, d4m, dynamic distributed dimensional data model, parallel programming, mpi clusters, llgridmapreduce, high level distributed arrays interface, hadoop clusters, scheduler, data models, data handling, distributed computing, concurrent query, apache accumulo database, message passing, map-reduce parallel programming model, parallel matlab, java, parallel ingestion, hdfs, llgrid mapreduce
DocType
Conference
ISSN
2377-6943
ISBN
978-1-4673-1577-7
Citations
12
PageRank
0.77
References
0
Authors
14
Name             Order  Citations  PageRank
Chansup Byun     1      180        19.21
William Arcand   2      175        17.77
David Bestor     3      181        19.08
Bill Bergeron    4      168        16.57
Matthew Hubbell  5      192        20.93
Jeremy Kepner    6      606        61.58
Andrew McCabe    7      12         0.77
Peter Michaleas  8      201        20.93
Julie Mullen     9      138        15.22
David O'Gwynn    10     35         3.42
Andrew Prout     11     182        18.78
Albert Reuther   12     335        37.32
Antonio Rosa     13     170        17.67
Charles Yee      14     147        15.14