Title | ||
---|---|---|
An Adaptive Framework for the Execution of Data-Intensive MapReduce Applications in the Cloud |
Abstract | ||
---|---|---|
Cloud computing technologies play an increasingly important role in realizing data-intensive applications by offering a virtualized compute and storage infrastructure that can scale on demand. A programming model that has gained considerable interest in this context is MapReduce, which simplifies the processing of large-scale distributed data volumes, usually on top of a distributed file system layer. In this paper we report on a self-configuring adaptive framework for developing and optimizing data-intensive scientific applications on top of Cloud and Grid computing technologies and the Hadoop framework. Our framework relies on a MAPE-K loop, known from autonomic computing, for optimizing the configuration of data-intensive applications at three abstraction layers: the application layer, the MapReduce layer, and the resource layer. By evaluating monitored resources, the framework configures the layers and allocates the resources on a per-job basis. The evaluation of configurations relies on historic data and a utility function that ranks different configurations with respect to the costs they incur. The optimization framework has been integrated into the Vienna Grid Environment (VGE), a service-oriented application development environment for providing applications on HPC systems, clusters and Clouds as services. An experimental evaluation of our framework has been undertaken with a data-analysis application from the field of molecular systems biology. |
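
The abstract describes ranking candidate configurations with a utility function evaluated over historic data. The following is a minimal sketch of that idea only, not the authors' implementation: the `Config` fields, the `utility` formula, and the `price_per_node_hour` and `time_weight` parameters are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Config:
    """One candidate configuration across the three layers (hypothetical fields)."""
    nodes: int      # resource layer: number of allocated nodes
    map_tasks: int  # MapReduce layer: number of map tasks
    chunk_mb: int   # application layer: input chunk size in MB


@dataclass(frozen=True)
class Run:
    """One historic job execution under a given configuration."""
    config: Config
    runtime_s: float


def utility(config, runtime_s, price_per_node_hour=0.10, time_weight=0.001):
    """Assumed utility: negative of resource cost plus a weighted runtime penalty."""
    resource_cost = config.nodes * price_per_node_hour * runtime_s / 3600.0
    return -(resource_cost + time_weight * runtime_s)


def rank_configs(history):
    """Rank configurations by utility of their mean historic runtime, best first."""
    runtimes = defaultdict(list)
    for run in history:
        runtimes[run.config].append(run.runtime_s)
    scored = [(cfg, utility(cfg, mean(times))) for cfg, times in runtimes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In a MAPE-K setting, a ranking of this kind would sit in the plan phase: monitored runs extend the history (the knowledge base), and the top-ranked configuration is applied to the next job.
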
Year | DOI | Venue |
---|---|---|
2011 | 10.1109/IPDPS.2011.254 | IPDPS Workshops |
Keywords | Field | DocType |
application layer,hadoop framework,abstraction layer,data-intensive application,resource layer,optimization framework,data-intensive mapreduce applications,self-configuring adaptive framework,adaptive framework,file system layer,mapreduce layer,grid computing technology,cloud computing,distributed file system,programming model,autonomic computing,grid computing,application development,systems biology,xml,system biology,data analysis,distributed databases | Distributed File System,Autonomic computing,Application layer,Grid computing,Programming paradigm,Computer science,Distributed database,Grid,Distributed computing,Cloud computing | Conference |
Citations | PageRank | References |
3 | 0.45 | 11 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Martin Koehler | 1 | 56 | 8.05 |
Yuriy Kaniovskyi | 2 | 14 | 2.78 |
Siegfried Benkner | 3 | 614 | 67.47 |