Title
Concise descriptions of subsets of structured sets
Abstract
We study the problem of economical representation of subsets of structured sets, which are sets equipped with a set cover or a family of preorders. Given a structured set U, and a language L whose expressions define subsets of U, the problem of minimum description length in L (L-MDL) is: “given a subset V of U, find a shortest string in L that defines V.” Depending on the structure and the language, the MDL-problem is in general intractable. We study the complexity of the MDL-problem for various structures and show that certain specializations are tractable. The families of focus are hierarchy, linear order, and their multidimensional extensions; these are found in the context of statistical and OLAP databases. In the case of general OLAP databases, data organization is a mixture of multidimensionality, hierarchy, and ordering, which can also be viewed naturally as a cover-structured ordered set. Efficient algorithms are provided for the MDL-problem for hierarchical and linearly ordered structures, and we prove that the multidimensional extensions are NP-complete. Finally, we illustrate the application of the theory to summarization of large result sets and (multi) query optimization for ROLAP queries.
Year
DOI
Venue
2005
10.1145/1061318.1061324
ACM Trans. Database Syst.
Keywords
DocType
Volume
query optimization,data organization,language l,minimal description length,general olap databases,set cover,large result set,olap databases,multidimensional extension,summarization,concise description,certain specialization,rolap query,olap,structured set,minimum description length,linear order
Journal
30
Issue
ISSN
Citations 
1
0362-5915
6
PageRank 
References 
Authors
0.52
15
2
Name
Order
Citations
PageRank
Ken Q. Pu134928.16
Alberto O. Mendelzon248481394.98