Abstract | ||
---|---|---|
We study the problem of economical representation of subsets of structured sets, which are sets equipped with a set cover or a family of preorders. Given a structured set U, and a language L whose expressions define subsets of U, the problem of minimum description length in L (L-MDL) is: “given a subset V of U, find a shortest string in L that defines V.” Depending on the structure and the language, the MDL-problem is in general intractable. We study the complexity of the MDL-problem for various structures and show that certain specializations are tractable. The families of focus are hierarchy, linear order, and their multidimensional extensions; these are found in the context of statistical and OLAP databases. In the case of general OLAP databases, data organization is a mixture of multidimensionality, hierarchy, and ordering, which can also be viewed naturally as a cover-structured ordered set. Efficient algorithms are provided for the MDL-problem for hierarchical and linearly ordered structures, and we prove that the multidimensional extensions are NP-complete. Finally, we illustrate the application of the theory to summarization of large result sets and (multi) query optimization for ROLAP queries. |
Year | DOI | Venue |
---|---|---|
2005 | 10.1145/1061318.1061324 | ACM Trans. Database Syst. |
Keywords | DocType | Volume |
query optimization,data organization,language l,minimal description length,general olap databases,set cover,large result set,olap databases,multidimensional extension,summarization,concise description,certain specialization,rolap query,olap,structured set,minimum description length,linear order | Journal | 30 |
Issue | ISSN | Citations |
1 | 0362-5915 | 6 |
PageRank | References | Authors |
0.52 | 15 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ken Q. Pu | 1 | 349 | 28.16 |
Alberto O. Mendelzon | 2 | 4848 | 1394.98 |