Title
Statistical learning techniques for costing XML queries
Abstract
Developing cost models for query optimization is significantly harder for XML queries than for traditional relational queries. The reason is that XML query operators are much more complex than relational operators such as table scans and joins. In this paper, we propose a new approach, called COMET, to modeling the cost of XML operators; to our knowledge, COMET is the first method ever proposed for addressing the XML query costing problem. As in relational cost estimation, COMET exploits a set of system catalog statistics that summarizes the XML data; the set of "simple path" statistics that we propose is new, and is well suited to the XML setting. Unlike the traditional approach, COMET uses a new statistical learning technique called "transform regression" instead of detailed analytical models to predict the overall cost. Besides rendering the cost estimation problem tractable for XML queries, COMET has the further advantage of enabling the query optimizer to be self-tuning, automatically adapting to changes over time in the query workload and in the system environment. We demonstrate COMET's feasibility by developing a cost model for the recently proposed XNAV navigational operator. Empirical studies with synthetic, benchmark, and real-world data sets show that COMET can quickly obtain accurate cost estimates for a variety of XML queries and data sets.
Year
2005
Venue
VLDB
Keywords
relational cost estimation, xml query, xml setting, accurate cost estimate, overall cost, xml data, cost model, xml query operator, xml operator, cost estimation problem tractable, empirical study, cost estimation, query optimization
Field
Query optimization, Data mining, Joins, Information retrieval, XML, Computer science, Cost estimate, Operator (computer programming), Relational operator, Activity-based costing, Rendering (computer graphics), Database
DocType
Conference
ISBN
1-59593-154-6
Citations
42
PageRank
1.72
References
19
Authors
5
Name
Order
Citations
PageRank
Ning Zhang138429.26
Peter J. Haas22799454.10
Vanja Josifovski32265148.84
Guy M. Lohman42846965.94
Chun Zhang52504166.15