Title
Ranking the Interestingness of Summaries from Data Mining Systems
Abstract
We study data mining where the task is description by summarization, the representation language is generalized relations, the evaluation criteria are based on heuristic measures of interestingness, and the method for searching is the Multi-Attribute Generalization algorithm for domain generalization graphs. We present and empirically compare four heuristics for ranking the interestingness of generalized relations (or summaries). The measures are based on common measures of the diversity of a population, statistical variance, the Simpson index, and the Shannon index. All four measures rank less complex summaries (i.e., those with few tuples and/or non-ANY attributes) as most interesting. Highly ranked summaries provide a reasonable starting point for further analysis of discovered knowledge.
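As a rough illustration of the diversity measures named in the abstract, the sketch below computes a variance, the Simpson index, and the Shannon index over the tuple counts of a summary. It uses the common textbook definitions, which are an assumption here: the abstract does not give the paper's exact formulas, and the four heuristics it compares may be defined differently. The function names and the sample counts are hypothetical.

```python
import math


def proportions(counts):
    """Convert raw tuple counts into proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def variance_of_proportions(counts):
    """Population variance of the proportions around the uniform value 1/m.

    Assumption: a textbook population variance, not necessarily the
    normalization used in the paper.
    """
    p = proportions(counts)
    uniform = 1 / len(p)
    return sum((x - uniform) ** 2 for x in p) / len(p)


def simpson_index(counts):
    """Simpson index: sum of squared proportions (higher = more concentrated)."""
    return sum(x * x for x in proportions(counts))


def shannon_index(counts):
    """Shannon index in bits: -sum(p * log2(p)), skipping empty classes."""
    return -sum(x * math.log2(x) for x in proportions(counts) if x > 0)


if __name__ == "__main__":
    # Hypothetical summaries: each list holds the count attached to each tuple.
    summaries = {"skewed": [50, 30, 20], "uniform": [10, 10, 10, 10]}
    for name, counts in summaries.items():
        print(
            name,
            round(variance_of_proportions(counts), 4),
            round(simpson_index(counts), 4),
            round(shannon_index(counts), 4),
        )
```

Sorting summaries by any one of these values gives a ranking; which direction counts as "more interesting" depends on the heuristic and is not specified here.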
Year: 1999
Venue: FLAIRS Conference
Keywords: data mining systems, indexation, data mining
Field: Population, Data mining, Computer science, Heuristics, Artificial intelligence, Natural language processing, Automatic summarization, Graph, Heuristic, Information retrieval, Ranking, Tuple, Representation language
DocType: Conference
ISBN: 1-57735-080-4
Citations: 8
PageRank: 2.91
References: 16
Authors: 3
Name | Order | Citations | PageRank
Robert J. Hilderman | 1 | 270 | 29.86
Howard J. Hamilton | 2 | 1501 | 145.55
Brock Barber | 3 | 86 | 9.48