Title
MapReduce in GPI-Space.
Abstract
The computing power of modern high performance systems cannot be fully exploited using traditional parallel programming models. At the same time, the growing demand for processing big data volumes requires better control of workflows, efficient storage management, and a fault-tolerant runtime system. To address these problems, we designed and developed GPI-Space, a complex but flexible software development and execution platform in which the data coordination of an application is decoupled from the programming of the algorithms. This allows domain users to focus solely on implementing their problem, while the fault-tolerant runtime framework automatically runs the application in parallel in complex environments. We discuss the advantages and disadvantages of our approach by comparing it with the most popular MapReduce implementation, Hadoop. Tests performed on a multicore cluster with the wordcount use case showed that GPI-Space is almost three times faster than Hadoop when only the execution times are considered, and more than six times faster when the data loading time is also considered.
Year
2013
DOI
10.1007/978-3-642-54420-0_5
Venue
Lecture Notes in Computer Science
Field
Computer science, Parallel computing, Fault tolerance, Storage management, Big data, Workflow, Multi-core processor, Software development, Runtime system, Distributed computing
DocType
Conference
Volume
8374
ISSN
0302-9743
Citations
0
PageRank
0.34
References
8
Authors
3
Name                   Order  Citations  PageRank
Tiberiu Rotaru         1      18         3.65
Mirko Rahn             2      4          1.71
Franz-Josef Pfreundt   3      65         14.02