Abstract | ||
---|---|---|
Learned cardinality estimation (CE) has recently gained significant attention as a replacement for long-studied traditional CE, using machine learning and especially deep learning. However, these estimators were developed independently and have not been fairly or comprehensively compared in common settings. Most studies use a subset of the IMDB data, which is too simple to measure their limits or to determine whether they are ready for real, complex data. Furthermore, they are regarded as black boxes, without a deep understanding of why large errors occur. In this paper, we first provide a taxonomy and a unified workflow of learned estimators to aid understanding of them. We next comprehensively compare recent learned CE methods that support joins, on data ranging from a subset of tables to the full IMDB and TPC-DS datasets. Based on the experimental results, we then demystify the black-box models and analyze the critical components that cause large errors. We also measure their impact on query optimization. Finally, based on the findings, we suggest realizable research opportunities. We believe that a deeper understanding of the behavior of existing methods can provide a more comprehensive and substantial framework for developing better estimators. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1145/3514221.3526154 | Proceedings of the 2022 International Conference on Management of Data (SIGMOD '22) |
Keywords | DocType | ISSN |
Cardinality estimation | Conference | 0730-8078 |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kyoungmin Kim | 1 | 0 | 0.34 |
Jisung Jung | 2 | 0 | 0.34 |
In Seo | 3 | 0 | 0.34 |
Wook-Shin Han | 4 | 805 | 57.85 |
Kangwoo Choi | 5 | 0 | 0.34 |
Jaehyok Chong | 6 | 0 | 0.34 |