Title
Pythia: Faster Big Data in Motion through Predictive Software-Defined Network Optimization at Runtime.
Abstract
The rise of Internet of Things sensors, social networking and mobile devices has led to an explosion of available data. The need to gain insights from this data has given rise to the field of Big Data analytics. The MapReduce framework, as implemented in Hadoop, is one of the most popular frameworks for Big Data analysis. To handle the ever-increasing data size, Hadoop is designed as a scalable framework that allows a dedicated, seemingly unbounded number of servers to participate in the analytics process. The response time of an analytics request is an important factor in time to value/insights. While compute and disk I/O capacity can be scaled with the number of servers, scaling the system leads to increased network traffic. Arguably, the communication-heavy phase of MapReduce contributes significantly to the overall response time; the problem is further aggravated if communication patterns are heavily skewed, as is not uncommon in many MapReduce workloads. In this paper we present a system that reduces the impact of skew by transparently predicting data communication volume at runtime and mapping the many end-to-end flows among the various processes to the underlying network, using emerging software-defined networking technologies to avoid hotspots in the network. Depending on the network oversubscription ratio, we demonstrate a reduction in job completion time of between 3% and 46% for popular MapReduce benchmarks such as Sort and Nutch.
Year: 2014
DOI: 10.1109/IPDPS.2014.20
Venue: IPDPS
Keywords: resource management, big data, computer networks, data processing, parallel programming, big data analytics, response time, distributed computing, job shop scheduling, routing, servers
Field: Computer science, sort, Parallel computing, Server, Computer network, Response time, Mobile device, Software-defined networking, Analytics, Big data, Distributed computing, Scalability
DocType: Conference
ISSN: 1530-2075
Citations: 12
PageRank: 0.62
References: 12
Authors: 4
Name | Order | Citations | PageRank
Marcelo Veiga Neves | 1 | 49 | 3.80
César A. F. De Rose | 2 | 1982 | 87.05
Kostas Katrinis | 3 | 102 | 19.41
Hubertus Franke | 4 | 1257 | 104.86