Title
Performance prediction for set similarity joins
Abstract
Query performance prediction is essential for many important tasks in cloud-based database management, including resource provisioning, admission control, and pricing. Recently, there has been some work on building prediction models to estimate the execution time of traditional SQL queries. While suitable for typical OLTP/OLAP workloads, these existing approaches are insufficient to model the performance of complex data processing activities for deep analytics, such as cleaning and integration of data. These activities are largely based on similarity operations, which are radically different from regular relational operators. In this paper, we consider prediction models for set similarity joins. We exploit knowledge of optimization techniques and design details popularly found in set similarity join algorithms to identify relevant features, which are then used to construct prediction models based on statistical machine learning. An extensive experimental evaluation confirms the accuracy of our approach.
Year: 2015
DOI: 10.1145/2695664.2695694
Venue: SAC 2015: Symposium on Applied Computing, Salamanca, Spain, April 2015
Keywords: Set Similarity Join, Performance Prediction, Cloud Databases
Field: SQL, Data mining, Joins, Admission control, Computer science, Online transaction processing, Relational operator, Analytics, Online analytical processing, Performance prediction
DocType: Conference
ISBN: 978-1-4503-3196-8
Citations: 2
PageRank: 0.37
References: 13
Authors: 4
Name | Order | Citations | PageRank
Christiane Faleiro Sidney | 1 | 2 | 0.37
Diego Sarmento Mendes | 2 | 2 | 0.37
Leonardo Andrade Ribeiro | 3 | 45 | 8.62
Theo Härder | 4 | 1132 | 307.12