Fast, Efficient Performance Predictions for Big Data Applications - Citegraph

Paper Info

Title
Fast, Efficient Performance Predictions for Big Data Applications

Abstract
Recent years have seen rapid growth in the deployment of machine learning workloads on big data analytics frameworks such as Apache Spark and Apache Flink. These workloads are typically represented as graphs, run on shared infrastructures, and often have far more demanding resource requirements than those traditionally found in enterprise settings. Predicting their execution times is important because they often run on shared public or private infrastructures, so their execution is strongly affected by resource sharing, by the hardware infrastructure used, and by the choice of configuration parameters exposed by the frameworks. In this work, we propose a fast and efficient performance prediction system that addresses the challenge of predicting the execution times of big data workloads. It exploits the fact that workloads are represented as processing graphs and often share similar structures and parameters, so the performance models already built for deployed workloads can be used to estimate the end-to-end execution time of a new workload. Previous works assume that a large number of profiling runs is available for building the prediction models. However, this assumption does not always hold, and more elaborate mechanisms are needed. Our detailed experimental evaluation on our local Spark cluster shows that our approach can accurately predict the execution time of a wide range of Spark workloads.
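
The idea of reusing knowledge from structurally similar workloads can be illustrated with a small sketch. This is not the system described in the paper: the abstract does not specify the similarity measure or the prediction model, so the code below assumes a Jaccard similarity over operator-graph edges and a similarity-weighted average of known runtimes. The names `Workload`, `jaccard`, and `predict_runtime`, the operator labels, and all runtimes are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

Edge = Tuple[str, str]  # (upstream operator, downstream operator)


@dataclass(frozen=True)
class Workload:
    """A workload reduced to the edge set of its processing graph (assumption)."""
    name: str
    edges: FrozenSet[Edge]
    runtime_s: Optional[float] = None  # measured runtime; None if not yet run


def jaccard(a: FrozenSet[Edge], b: FrozenSet[Edge]) -> float:
    """Jaccard similarity of two edge sets, in [0, 1]."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def predict_runtime(new: Workload, history: List[Workload], k: int = 2) -> float:
    """Similarity-weighted average runtime of the k most similar past workloads."""
    if not history:
        raise ValueError("history must contain at least one profiled workload")
    scored = sorted(
        ((jaccard(new.edges, w.edges), w) for w in history),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(score for score, _ in scored)
    if total == 0:
        # No structural overlap at all: fall back to a plain average.
        return sum(w.runtime_s for _, w in scored) / len(scored)
    return sum(score * w.runtime_s for score, w in scored) / total


# Hypothetical, hand-made data for illustration only.
history = [
    Workload("kmeans", frozenset({("read", "map"), ("map", "reduce"), ("reduce", "collect")}), 120.0),
    Workload("pagerank", frozenset({("read", "join"), ("join", "reduce"), ("reduce", "collect")}), 300.0),
    Workload("wordcount", frozenset({("read", "flatMap"), ("flatMap", "reduceByKey")}), 40.0),
]
new_job = Workload(
    "new_job",
    frozenset({("read", "map"), ("map", "reduce"), ("reduce", "collect"), ("collect", "save")}),
)
estimate = predict_runtime(new_job, history, k=2)
print(f"estimated runtime: {estimate:.1f} s")
```

In this toy setup the estimate is pulled toward the runtime of the structurally closest workload, which mirrors the intuition in the abstract that few or no profiling runs are needed when a similar workload has already been modeled.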

Field	Value
Year	2019
DOI	10.1109/ISORC.2019.00034
Venue	2019 IEEE 22nd International Symposium on Real-Time Distributed Computing (ISORC)
Keywords	Graph Similarity, Predictions, Distributed Systems
DocType	Conference
ISSN	1555-0885
ISBN	978-1-7281-0152-1
Citations	0
PageRank	0.34
References	20
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Stathis Maroulis	1	5	2.07
Nikos Zacheilas	2	79	9.40
Thanasis Theocharis	3	0	0.34
Vana Kalogeraki	4	1686	124.40
