Ranking the Interestingness of Summaries from Data Mining Systems - Citegraph

Paper Info

Title
Ranking the Interestingness of Summaries from Data Mining Systems

Abstract
We study data mining where the task is description by summarization, the representation language is generalized relations, the evaluation criteria are based on heuristic measures of interestingness, and the method for searching is the Multi-Attribute Generalization algorithm for domain generalization graphs. We present and empirically compare four heuristics for ranking the interestingness of generalized relations (or summaries). The measures are based on common measures of the diversity of a population, statistical variance, the Simpson index, and the Shannon index. All four measures rank less complex summaries (i.e., those with few tuples and/or non-ANY attributes) as most interesting. Highly ranked summaries provide a reasonable starting point for further analysis of discovered knowledge.
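
The abstract names the diversity measures but does not give their formulas. The sketch below shows the textbook forms of a variance-style score, the Simpson index, and the Shannon index, computed over the tuple counts of a summary. It is an illustration under stated assumptions, not the paper's exact heuristics: the function names, the use of tuple counts as the population, and the choice to measure variance around the uniform proportion 1/m are all assumptions made here.

```python
import math


def proportions(counts):
    """Convert per-tuple counts of a summary into proportions that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return [c / total for c in counts]


def variance_score(counts):
    """Sample variance of the proportions around the uniform value 1/m.

    Assumption: deviation from a uniform distribution is used as the
    variance-based score; the paper may define it differently.
    """
    p = proportions(counts)
    m = len(p)
    if m < 2:
        return 0.0
    mean = 1.0 / m
    return sum((x - mean) ** 2 for x in p) / (m - 1)


def simpson_index(counts):
    """Simpson index: sum of squared proportions (higher means more concentrated)."""
    return sum(x * x for x in proportions(counts))


def shannon_index(counts):
    """Shannon index in bits: -sum(p * log2(p)) (higher means more even)."""
    return -sum(x * math.log2(x) for x in proportions(counts) if x > 0)


def rank_summaries(summaries, score, reverse=True):
    """Order hypothetical {name: counts} summaries by a chosen score function."""
    return sorted(summaries, key=lambda name: score(summaries[name]), reverse=reverse)


if __name__ == "__main__":
    # Hypothetical summaries: each maps to the counts of its tuples.
    summaries = {"even": [2, 2, 2, 2], "skewed": [7, 1]}
    for name in rank_summaries(summaries, simpson_index):
        print(name, round(simpson_index(summaries[name]), 4))
```

Under these definitions a summary with few, unevenly weighted tuples scores high on the variance and Simpson measures and low on the Shannon measure, so the sort direction passed to `rank_summaries` has to match the measure being used.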

Year	Venue	Keywords
1999	FLAIRS Conference	data mining systems,indexation,data mining
Field	DocType	ISBN
Population,Data mining,Computer science,Heuristics,Artificial intelligence,Natural language processing,Automatic summarization,Graph,Heuristic,Information retrieval,Ranking,Tuple,Representation language	Conference	1-57735-080-4
Citations	PageRank	References
8	2.91	16
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Robert J. Hilderman	1	270	29.86
Howard J. Hamilton	2	1501	145.55
Brock Barber	3	86	9.48
