Title
KDDLog: Performance and Scalability in Knowledge Discovery by Declarative Queries with Aggregates
Abstract
Demand for powerful, high-performance analytics on large databases keeps growing. Database management systems have long shown that descriptive analytics can be supported effectively by enriching traditional aggregates with constructs such as data cubes and other ROLAP extensions, thus extending the optimizability and parallelizability of RDBMSs. In this paper, we show that these benefits can now be extended to predictive analytics, e.g., clustering, classification, and association, by using aggregates in declarative recursive queries. To this end, we introduce KDDLog, a scalable framework that leverages recursive queries with aggregates and our newly proposed chain aggregates to enable users to build or customize knowledge discovery models with concise and expressive queries. We further propose specialized compilation techniques for semi-naive fixpoint computation in the presence of aggregates, and optimizations for complex recursive queries on distributed data platforms. We provide KDDLib to build knowledge discovery tasks, along with advanced interfaces that ease the porting of new models. Extensive evaluations on large-scale datasets demonstrate that our approach achieves promising performance gains while offering both increased generality and ease of programming for knowledge discovery applications.
Year
2021
DOI
10.1109/ICDE51399.2021.00113
Venue
2021 IEEE 37th International Conference on Data Engineering (ICDE 2021)
Keywords
Declarative Knowledge Discovery, Chain Aggregates, Aggregates in Recursion
DocType
Conference
ISSN
1084-4627
Citations
0
PageRank
0.34
References
0
Authors
6
Name
Order
Citations
PageRank
Y. F. Li 1 1128105.83
Jin Wang 2 32988.76
Mingda Li 3 237.54
Ariyam Das 4 348.00
Jiaqi Gu 5 263.13
Carlo Zaniolo 6 43051447.58