Title
Concise descriptions of subsets of structured sets
Abstract
We study the problem of economical representation of subsets of structured sets, that is, sets equipped with a set cover. Given a structured set U, and a language L whose expressions define subsets of U, the problem of Minimum Description Length in L (L-MDL) is: "given a subset V of U, find a shortest string in L that defines V". We show that the simple set cover is enough to model a number of realistic database structures. We focus on two important families: hierarchical and multidimensional organizations. The former is found in the context of semistructured data such as XML, the latter in the context of statistical and OLAP databases. In the case of general OLAP databases, data organization is a mixture of multidimensionality and hierarchy, which can also be viewed naturally as a structured set. We study the complexity of the L-MDL problem in several settings, and provide an efficient algorithm for the hierarchical case. Finally, we illustrate the application of the theory to summarization of large result sets, (multi) query optimization for ROLAP queries, and XML queries.
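To make the problem statement concrete, the sketch below illustrates L-MDL in a deliberately simplified setting. It assumes a tree-shaped hierarchy whose leaves are the elements of U, and a language that allows only unions of node names, with description length counted as the number of names used. These assumptions, the function names, and the sample data are illustrative only; this is not the algorithm given in the paper, whose languages may be richer.

```python
from typing import Dict, List, Set

Tree = Dict[str, List[str]]  # hypothetical encoding: node name -> child names


def leaves_under(tree: Tree, node: str) -> Set[str]:
    """Return the set of leaf names in the subtree rooted at `node`."""
    children = tree.get(node, [])
    if not children:
        return {node}  # a node without children is itself an element of U
    result: Set[str] = set()
    for child in children:
        result |= leaves_under(tree, child)
    return result


def shortest_union_description(tree: Tree, root: str, target: Set[str]) -> List[str]:
    """Describe `target` (a set of leaves) with as few node names as possible.

    Assumption: the only operator is union, so every name used must denote
    a set fully contained in `target`. In a tree, the topmost such nodes
    are pairwise disjoint and give a shortest union-only description.
    """
    def visit(node: str) -> List[str]:
        covered = leaves_under(tree, node)
        if covered <= target:
            return [node]            # whole subtree is wanted: one name suffices
        if not (covered & target):
            return []                # nothing wanted below this node
        names: List[str] = []
        for child in tree.get(node, []):
            names.extend(visit(child))
        return names

    return visit(root)


# Illustrative hierarchy (made-up sample data).
hierarchy = {
    "Canada": ["Ontario", "Quebec"],
    "Ontario": ["Toronto", "Ottawa"],
    "Quebec": ["Montreal", "Quebec City"],
}
print(shortest_union_description(hierarchy, "Canada", {"Toronto", "Ottawa", "Montreal"}))
```

Here the three selected cities are described by two names, "Ontario" and "Montreal", instead of three. The sketch recomputes leaf sets at every node for brevity; a single bottom-up pass would avoid that repeated work.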
Year
2003
DOI
10.1145/773153.773166
Venue
Symposium on Principles of Database Systems
Keywords
hierarchical case, olap databases, large result set, xml query, data organization, language l, concise description, simple set cover, l-mdl problem, general olap databases, structured set, minimum description length, query optimization, set cover
DocType
Conference
ISBN
1-58113-670-6
Citations
8
PageRank
0.63
References
14
Authors
2
Name | Order | Citations | PageRank
Alberto O. Mendelzon | 1 | 4848 | 1394.98
Ken Q. Pu | 2 | 349 | 28.16