Title
Data Mining in Large Databases Using Domain Generalization Graphs
Abstract
Attribute-oriented generalization summarizes the information in a relational database by repeatedly replacing specific attribute values with more general concepts according to user-defined concept hierarchies. We introduce domain generalization graphs (DGGs) for controlling the generalization of a set of attributes and show how they are constructed. We then present serial and parallel versions of the Multi-Attribute Generalization algorithm for traversing the generalization state space described by joining the domain generalization graphs for multiple attributes. Based upon a generate-and-test approach, the algorithm generates all possible summaries consistent with the domain generalization graphs. Our experimental results show that significant speedups are possible by partitioning path combinations from the DGGs across multiple processors. We also rank the interestingness of the resulting summaries using measures based upon variance and relative entropy. Our experimental results also show that these measures provide an effective basis for analyzing summary data generated from relational databases. Variance appears more useful because it tends to rank the less complex summaries (i.e., those with few attributes and/or tuples) as more interesting.
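The generate-and-test idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Multi-Attribute Generalization algorithm: it assumes each DGG is a simple chain of levels, the sample rows and hierarchies (`ROWS`, `CITY_LEVELS`, `PRODUCT_LEVELS`) are hypothetical, and the two measures are one plausible reading of "variance" and "relative entropy" (both computed against a uniform distribution over the summary's tuples), since the abstract does not give their exact definitions.

```python
import math
from collections import Counter
from itertools import product

# Hypothetical data: (city, product) tuples from a sales relation.
ROWS = [("Regina", "Laptop"), ("Saskatoon", "Laptop"),
        ("Calgary", "Desktop"), ("Regina", "Desktop")]

# Assumption: each DGG is a chain; level i maps a raw value to its concept.
# An empty dict is the identity (no generalization).
CITY_LEVELS = [
    {},
    {"Regina": "Saskatchewan", "Saskatoon": "Saskatchewan", "Calgary": "Alberta"},
    {"Regina": "ANY", "Saskatoon": "ANY", "Calgary": "ANY"},
]
PRODUCT_LEVELS = [{}, {"Laptop": "ANY", "Desktop": "ANY"}]
DGGS = [CITY_LEVELS, PRODUCT_LEVELS]

def generalize(rows, levels):
    """Replace each attribute value by its concept and count identical tuples."""
    return Counter(tuple(m.get(v, v) for m, v in zip(levels, row)) for row in rows)

def all_summaries(rows, dggs):
    """Generate one summary per combination of levels, one level per attribute."""
    for combo in product(*(range(len(d)) for d in dggs)):
        levels = [d[i] for d, i in zip(dggs, combo)]
        yield combo, generalize(rows, levels)

def variance(summary):
    """Variance of tuple probabilities around the uniform probability 1/m."""
    total, m = sum(summary.values()), len(summary)
    if m < 2:
        return 0.0
    q = 1.0 / m
    return sum((c / total - q) ** 2 for c in summary.values()) / (m - 1)

def relative_entropy(summary):
    """Kullback-Leibler divergence (in bits) from the uniform distribution."""
    total, m = sum(summary.values()), len(summary)
    q = 1.0 / m
    return sum((c / total) * math.log2((c / total) / q) for c in summary.values())

# Rank every generated summary by the variance measure, highest first.
ranked = sorted(all_summaries(ROWS, DGGS), key=lambda s: variance(s[1]), reverse=True)
for combo, summary in ranked:
    print(combo, dict(summary), round(variance(summary), 4))
```

Because each level combination is processed independently, the combinations produced by `product` could be split across worker processes, which is the kind of partitioning the abstract credits for its parallel speedups.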
Year: 1999
DOI: 10.1023/A:1008769516670
Venue: J. Intell. Inf. Syst.
Keywords: data mining, knowledge discovery, machine learning, knowledge representation, attribute-oriented generalization, domain generalization graphs
Field: Data mining, Relational database, Computer science, Artificial intelligence, Hierarchy, Knowledge representation and reasoning, Generalization, Tuple, Knowledge extraction, State space, Database, Machine learning, Kullback–Leibler divergence
DocType: Journal
Volume: 13
Issue: 3
ISSN: 1573-7675
Citations: 21
PageRank: 1.57
References: 31
Authors
3
Name
Order
Citations
PageRank
Robert J. Hilderman127029.86
Howard J. Hamilton21501145.55
Nick Cercone31999570.62