Abstract | ||
---|---|---|
MapReduce is a scalable and fault-tolerant framework, patented by Google, for computing embarrassingly parallel reductions. Hadoop is an open-source implementation of Google MapReduce that is made available as a web service to cloud users by the Amazon Web Services (AWS) cloud computing infrastructure. Amazon Spot Instances (SIs) provide an inexpensive yet transient and market-based option for purchasing virtualized instances for execution in AWS. Rather than being terminated only when the user chooses, an SI can also be terminated automatically as a function of the market price and the user's maximum bid price. We find that we can significantly improve the runtime of MapReduce jobs in our benchmarks by using SIs as accelerators. However, we also find that SI termination due to budget constraints during the job can have adverse effects on the runtime and may cause the user to overpay for their job. We describe new techniques that help reduce such effects. |
Year | Venue | Keywords |
---|---|---|
2010 | HotCloud | maximum user bid price,amazonweb services,google mapreduce,cloud computing infrastructure,spot instance,budget constraint,spot run,fault tolerant framework,amazon spot instances,market price,si termination,mapreduce job,mapreduce workflows |
Field | DocType | Citations |
---|---|---|
Computer science, Embarrassingly parallel, Real-time computing, Fault tolerance, Purchasing, Web service, Workflow, Operating system, Cloud computing, Scalability, Bid price, Distributed computing | Conference | 87 |
PageRank | References | Authors |
---|---|---|
5.22 | 6 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Navraj Chohan | 1 | 195 | 14.43 |
Claris Castillo | 2 | 231 | 14.93 |
Mike Spreitzer | 3 | 2178 | 451.09 |
Malgorzata Steinder | 4 | 1016 | 65.74 |
Asser N. Tantawi | 5 | 1055 | 141.98 |
Chandra Krintz | 6 | 812 | 80.49 |