Title
Cost Model and Analysis of Iterative MapReduce Applications for Hybrid Cloud Bursting.
Abstract
A popular and cost-effective way to deal with the increasing complexity of big data analytics is hybrid cloud bursting, which leases temporary off-premise cloud resources to boost overall capacity during peak utilization. The main challenge of hybrid cloud bursting is that the network link between the on-premise and off-premise computational resources often exhibits high latency and low throughput ("weak link") compared to the links within the same data center. This paper introduces a cost model specifically designed for iterative MapReduce applications, a popular class of large-scale data-intensive applications that provide near real-time responsiveness, running in a hybrid cloud bursting scenario. Using this cost model, users can discover trends that help them reason about how to balance performance, accuracy and cost to best meet their requirements. We illustrate this approach through a cost analysis of two real-life iterative MapReduce applications, using extensive horizontal scalability experiments that involve multiple hybrid cloud bursting strategies.
Year
2017
DOI
10.1109/CCGRID.2017.146
Venue
CCGrid
Keywords
Hybrid Cloud, Cloud Bursting, Big Data Analytics, Cost Analysis, Cost Model, Data Locality
Field
Data modeling, Data transmission, Cloud bursting, Latency (engineering), Computer science, Throughput, Big data, Distributed computing, Scalability, Cloud computing
DocType
Conference
ISSN
2376-4414
Citations
3
PageRank
0.38
References
10
Authors
3
Name
Order
Citations
PageRank
Francisco J. Clemente-Castelló1202.68
Rafael Mayo276276.75
Juan Carlos Fernández3729.77