Title
Energy efficiency for large-scale MapReduce workloads with significant interactive analysis
Abstract
MapReduce workloads have evolved to include increasing amounts of time-sensitive, interactive data analysis; we refer to such workloads as MapReduce with Interactive Analysis (MIA). Such workloads run on large clusters, whose size and cost make energy efficiency a critical concern. Prior works on MapReduce energy efficiency have not yet considered this workload class. Increasing hardware utilization helps improve efficiency, but is challenging to achieve for MIA workloads. These concerns lead us to develop BEEMR (Berkeley Energy Efficient MapReduce), an energy efficient MapReduce workload manager motivated by empirical analysis of real-life MIA traces at Facebook. The key insight is that although MIA clusters host huge data volumes, the interactive jobs operate on a small fraction of the data, and thus can be served by a small pool of dedicated machines; the less time-sensitive jobs can run on the rest of the cluster in a batch fashion. BEEMR achieves 40-50% energy savings under tight design constraints, and represents a first step towards improving energy efficiency for an increasingly important class of datacenter workloads.
Year
2012
DOI
10.1145/2168836.2168842
Venue
EuroSys
Keywords
large-scale mapreduce workloads, real-life mia trace, mapreduce workloads, datacenter workloads, energy saving, energy efficient mapreduce workload, berkeley energy efficient mapreduce, mia clusters host, mapreduce energy efficiency, significant interactive analysis, mia workloads, energy efficiency, distributed systems, data analysis, energy efficient
Field
Cluster (physics), Interactive analysis, Workload, Computer science, Efficient energy use, Real-time computing, Operating system, Distributed computing
DocType
Conference
Citations
92
PageRank
3.03
References
31
Authors
4
Name, Order, Citations, PageRank
Yanpei Chen, 1, 917, 41.46
Sara Alspaugh, 2, 553, 22.91
Dhruba Borthakur, 3, 2022, 80.76
Randy H. Katz, 4, 16819, 3018.89