Title
Medusa: An Efficient Cloud Fault-Tolerant MapReduce
Abstract
Applications such as web search and social networking have been moving from centralized to decentralized cloud architectures to improve their scalability. MapReduce, a programming framework for processing large amounts of data using thousands of machines in a single cloud, also needs to be scaled out to multiple clouds to adapt to this evolution. Building a multi-cloud distributed architecture is a substantial challenge in itself. Dealing with the new types of faults introduced by such a setting, such as the outage of a whole datacenter or an arbitrary fault caused by a malicious cloud insider, makes the task considerably harder. In this paper we propose Medusa, a platform that allows MapReduce computations to scale out to multiple clouds and tolerate several types of faults. Our solution fulfills four objectives. First, it is transparent to the user, who writes her typical MapReduce application without modification. Second, it does not require any modification to the widely used Hadoop framework. Third, the proposed system goes well beyond the fault tolerance offered by MapReduce to tolerate arbitrary faults, cloud outages, and even malicious faults caused by corrupt cloud insiders. Fourth, it achieves this increased level of fault tolerance at reasonable cost. We performed an extensive experimental evaluation in the ExoGENI testbed, demonstrating that our solution significantly reduces execution time when compared to traditional methods that achieve the same level of resilience.
Year
2015
DOI
10.1109/CCGrid.2016.20
Venue
2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)
Keywords
MapReduce, Hadoop, Cloud, Fault-tolerance
DocType
Journal
Volume
abs/1511.07185
ISSN
2376-4414
ISBN
978-1-5090-2454-4
Citations
4
PageRank
0.42
References
19
Authors
4
Name                   Order  Citations  PageRank
Pedro Costa            1      24         2.13
Xiao Bai               2      4          0.42
Fernando M. V. Ramos   3      1157       51.90
Miguel Correia         4      1056       71.21