Title
A framework to support multiple query optimization for complex mining tasks
Abstract
With the increasing use of data mining tools and techniques, we envision that a Knowledge Discovery and Data Mining System (KDDMS) will have to support and optimize for the following scenarios: 1) Sequence of Queries: a user may analyze one or more datasets by issuing a sequence of related complex mining queries, and 2) Multiple Simultaneous Queries: several users may be analyzing a set of datasets concurrently, and may issue related complex queries. This paper presents a systematic mechanism to optimize for the above cases, targeting the class of mining queries involving frequent pattern mining on one or multiple datasets. We present a system architecture and propose new algorithms for this purpose. We show the design of a knowledgeable cache which can store the results of past queries on multiple datasets. We present algorithms which use the results stored in such a cache to further optimize multiple queries. We have implemented and evaluated our system with both real and synthetic datasets. Our experimental results show that our techniques can achieve a speedup of up to a factor of 9, compared with systems which do not support caching or optimize for multiple queries.
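The idea of reusing cached frequent-pattern results can be illustrated with a minimal sketch. It is not the paper's architecture or algorithms; it only relies on the general fact that the frequent itemsets at a higher support threshold are a subset of those mined at a lower threshold on the same dataset. The class name `PatternCache`, its methods, and the dataset name `"retail"` are hypothetical, introduced here for illustration only.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Maps each frequent itemset to its relative support in the dataset.
Itemsets = Dict[FrozenSet[str], float]


class PatternCache:
    """Hypothetical cache of frequent-itemset results, keyed by dataset name.

    Assumption: one entry per dataset, holding the result mined at the
    lowest support threshold seen so far.
    """

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[float, Itemsets]] = {}

    def put(self, dataset: str, min_support: float, itemsets: Itemsets) -> None:
        current = self._store.get(dataset)
        # A result mined at a lower threshold subsumes results at higher ones,
        # so only replace the entry when the new threshold is lower.
        if current is None or min_support < current[0]:
            self._store[dataset] = (min_support, dict(itemsets))

    def lookup(self, dataset: str, min_support: float) -> Optional[Itemsets]:
        entry = self._store.get(dataset)
        if entry is None or min_support < entry[0]:
            # Miss: nothing cached, or the cached result was mined at a
            # higher threshold and may lack itemsets the query needs.
            return None
        cached_itemsets = entry[1]
        # Hit: answer the query by filtering, with no new pass over the data.
        return {s: sup for s, sup in cached_itemsets.items() if sup >= min_support}


cache = PatternCache()
cache.put("retail", 0.10, {frozenset({"a"}): 0.50, frozenset({"a", "b"}): 0.15})
hit = cache.lookup("retail", 0.20)   # answered from the cache by filtering
miss = cache.lookup("retail", 0.05)  # threshold below the cached one: must re-mine
```

In this sketch a later query with a higher threshold is served entirely from the cache, while a query with a lower threshold falls through to the miner. The paper's cache additionally handles queries spanning multiple datasets, which this example does not attempt to model.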
Year: 2005
DOI: 10.1145/1133890.1133893
Venue: MDM@KDD
Field: Query optimization, Data mining, Data stream mining, Computer science, Cache, Complex data type, Artificial intelligence, Knowledge extraction, Systems architecture, Machine learning, Speedup
DocType: Conference
ISBN: 1-59593-216-X
Citations: 3
PageRank: 0.36
References: 29
Authors: 3
Name, Order, Citations, PageRank
Ruoming Jin, 1, 1637, 91.73
Kaushik Sinha, 2, 244, 17.81
Gagan Agrawal, 3, 2058, 209.59