Abstract | ||
---|---|---|
In recent years, we have observed rapid growth in the deployment of machine learning workloads on big data analytics frameworks such as Apache Spark and Apache Flink. These workloads are typically represented as graphs, run on shared infrastructures, and often have much more demanding resource requirements than those traditionally found in enterprise settings. Predicting the execution times of these workloads is important because they often run on shared public or private infrastructures and, thus, their execution is greatly affected by resource sharing, the hardware infrastructure utilized, and the choice of configuration parameters provided by the frameworks. In this work, we propose a fast and efficient performance prediction system to address the challenge of predicting the execution times of big data workloads, exploiting the fact that workloads are represented as processing graphs and often share similar structures and parameters. Thus, we can use the performance models we have built for already deployed workloads to estimate the end-to-end execution time of a new workload. Previous works assume that a large number of profiling runs can be utilized for building the prediction models. However, this assumption is not always valid, and more elaborate mechanisms need to be applied. Our detailed experimental evaluation on our local Spark cluster illustrates that our approach can accurately predict the execution time of a wide range of Spark workloads. |
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ISORC.2019.00034 | 2019 IEEE 22nd International Symposium on Real-Time Distributed Computing (ISORC) |
Keywords | DocType | ISSN |
Graph Similarity, Predictions, Distributed Systems | Conference | 1555-0885 |
ISBN | Citations | PageRank |
978-1-7281-0152-1 | 0 | 0.34 |
References | Authors | |
20 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Stathis Maroulis | 1 | 5 | 2.07 |
Nikos Zacheilas | 2 | 79 | 9.40 |
Thanasis Theocharis | 3 | 0 | 0.34 |
Vana Kalogeraki | 4 | 1686 | 124.40 |