Abstract | ||
---|---|---|
Attribute-oriented generalization summarizes the information in a relational database by repeatedly replacing specific attribute values with more general concepts according to user-defined concept hierarchies. We introduce domain generalization graphs for controlling the generalization of a set of attributes and show how they are constructed. We then present serial and parallel versions of the Multi-Attribute Generalization algorithm for traversing the generalization state space described by joining the domain generalization graphs for multiple attributes. Based upon a generate-and-test approach, the algorithm generates all possible summaries consistent with the domain generalization graphs. Our experimental results show that significant speedups are possible by partitioning path combinations from the DGGs across multiple processors. We also rank the interestingness of the resulting summaries using measures based upon variance and relative entropy. Our experimental results also show that these measures provide an effective basis for analyzing summary data generated from relational databases. Variance appears more useful because it tends to rank the less complex summaries (i.e., those with few attributes and/or tuples) as more interesting. |
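
The generate-and-test idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's Multi-Attribute Generalization algorithm: the toy relation, the two hierarchies (`CITY_DGG`, `AGE_DGG`), and all function names are hypothetical, and each domain generalization graph is simplified to a single path of nodes. The variance measure is assumed here to be the sample variance of the tuple proportions around the uniform proportion 1/m, which may differ in detail from the definition used by the authors.

```python
from collections import Counter
from itertools import product

# Toy relation of (city, age) tuples; all values are invented for illustration.
ROWS = [("Regina", 23), ("Saskatoon", 37), ("Calgary", 41), ("Regina", 68)]

PROVINCE = {"Regina": "SK", "Saskatoon": "SK", "Calgary": "AB"}

# Hypothetical single-path DGGs: node name -> function mapping a raw value
# to the concept at that level of generality.
CITY_DGG = {
    "city": lambda v: v,
    "province": lambda v: PROVINCE[v],
    "ANY": lambda v: "ANY",
}
AGE_DGG = {
    "age": lambda v: v,
    "age_group": lambda v: "young" if v < 40 else "old",
    "ANY": lambda v: "ANY",
}


def summarize(rows, funcs):
    """Generalize each tuple and count how many rows fall into each result."""
    return Counter(tuple(f(v) for f, v in zip(funcs, row)) for row in rows)


def all_summaries(rows, dggs):
    """Build one summary per combination of nodes, one node per attribute."""
    results = {}
    for nodes in product(*(d.keys() for d in dggs)):
        funcs = [d[n] for d, n in zip(dggs, nodes)]
        results[nodes] = summarize(rows, funcs)
    return results


def variance_from_uniform(summary):
    """Assumed measure: sample variance of tuple proportions around 1/m."""
    m = len(summary)
    if m < 2:
        return 0.0  # a one-tuple summary is treated as having no spread
    total = sum(summary.values())
    q = 1.0 / m
    return sum((c / total - q) ** 2 for c in summary.values()) / (m - 1)


summaries = all_summaries(ROWS, [CITY_DGG, AGE_DGG])

# Rank node combinations from highest to lowest variance.
ranked = sorted(
    summaries,
    key=lambda nodes: variance_from_uniform(summaries[nodes]),
    reverse=True,
)
```

With two attributes of three nodes each, `all_summaries` produces nine summaries. Because the combinations are independent of one another, they could in principle be split across worker processes, which is the kind of partitioning the abstract reports speedups for.
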
Year | DOI | Venue |
---|---|---|
1999 | 10.1023/A:1008769516670 | J. Intell. Inf. Syst. |
Keywords | Field | DocType |
data mining, knowledge discovery, machine learning, knowledge representation, attribute-oriented generalization, domain generalization graphs | Data mining, Relational database, Computer science, Artificial intelligence, Hierarchy, Knowledge representation and reasoning, Generalization, Tuple, Knowledge extraction, State space, Database, Machine learning, Kullback–Leibler divergence | Journal |
Volume | Issue | ISSN |
13 | 3 | 1573-7675 |
Citations | PageRank | References |
21 | 1.57 | 31 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Robert J. Hilderman | 1 | 270 | 29.86 |
Howard J. Hamilton | 2 | 1501 | 145.55 |
Nick Cercone | 3 | 1999 | 570.62 |