Title
See Spot Run: Using Spot Instances for MapReduce Workflows
Abstract
MapReduce is a scalable and fault-tolerant framework, patented by Google, for computing embarrassingly parallel reductions. Hadoop is an open-source implementation of Google MapReduce that is made available as a web service to cloud users by the Amazon Web Services (AWS) cloud computing infrastructure. Amazon Spot Instances (SIs) provide an inexpensive yet transient and market-based option for purchasing virtualized instances for execution in AWS. In addition to being terminated manually by the user, an SI can be terminated automatically as a function of the market price and the user's maximum bid price. We find that we can significantly improve the runtime of MapReduce jobs in our benchmarks by using SIs as accelerators. However, we also find that SI termination due to budget constraints during the job can have adverse effects on the runtime and may cause the user to overpay for their job. We describe new techniques that help reduce such effects.
Year
2010
Venue
HotCloud
Keywords
maximum user bid price, amazonweb services, google mapreduce, cloud computing infrastructure, spot instance, budget constraint, spot run, fault tolerant framework, amazon spot instances, market price, si termination, mapreduce job, mapreduce workflows
Field
Computer science, Embarrassingly parallel, Real-time computing, Fault tolerance, Purchasing, Web service, Workflow, Operating system, Cloud computing, Scalability, Bid price, Distributed computing
DocType
Conference
Citations
87
PageRank
5.22
References
6
Authors
6
Name | Order | Citations | PageRank
Navraj Chohan | 1 | 195 | 14.43
Claris Castillo | 2 | 231 | 14.93
Mike Spreitzer | 3 | 2178 | 451.09
Malgorzata Steinder | 4 | 1016 | 65.74
Asser N. Tantawi | 5 | 1055 | 141.98
Chandra Krintz | 6 | 812 | 80.49