Abstract
---
This paper introduces the novel task of topic coherence evaluation, whereby a set of words, as generated by a topic model, is rated for coherence or interpretability. We apply a range of topic scoring models to the evaluation task, drawing on WordNet, Wikipedia and the Google search engine, and existing research on lexical similarity/relatedness. In comparison with human scores for a set of learned topics over two distinct datasets, we show a simple co-occurrence measure based on pointwise mutual information over Wikipedia data is able to achieve results for the task at or nearing the level of inter-annotator correlation, and that other Wikipedia-based lexical relatedness methods also achieve strong results. Google produces strong, if less consistent, results, while our results over WordNet are patchy at best.
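The best-performing measure in the abstract scores a topic by pointwise mutual information (PMI) between its top words, estimated from co-occurrence counts in a reference corpus (Wikipedia in the paper). Below is a minimal Python sketch of that idea; the sliding-window counting scheme, the function name `pmi_coherence`, and the toy corpus are illustrative assumptions, not the authors' exact implementation.

```python
import math
from itertools import combinations

def pmi_coherence(topic_words, documents, window=10, eps=1e-12):
    """Mean pairwise PMI over a topic's top words, with probabilities
    estimated from sliding-window co-occurrence counts in a corpus."""
    word_count, pair_count, total_windows = {}, {}, 0
    for doc in documents:
        tokens = doc.lower().split()
        # Slide a fixed-width window over the document; each window is
        # one observation for the co-occurrence statistics.
        for i in range(max(1, len(tokens) - window + 1)):
            win = sorted(set(tokens[i:i + window]))
            total_windows += 1
            for w in win:
                word_count[w] = word_count.get(w, 0) + 1
            for a, b in combinations(win, 2):
                pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    # Average PMI = log( p(a,b) / (p(a) * p(b)) ) over all word pairs.
    scores = []
    for a, b in combinations(sorted(set(topic_words)), 2):
        p_a = word_count.get(a, 0) / total_windows
        p_b = word_count.get(b, 0) / total_windows
        p_ab = pair_count.get((a, b), 0) / total_windows
        # eps guards against log(0) for pairs never seen together.
        scores.append(math.log((p_ab + eps) / (p_a * p_b + eps)))
    return sum(scores) / len(scores) if scores else 0.0

# Toy usage with a stand-in corpus (real use would stream Wikipedia text):
docs = ["the space shuttle orbits the earth",
        "nasa launched the shuttle into orbit",
        "the recipe calls for two eggs and flour"]
print(pmi_coherence(["shuttle", "orbit", "nasa"], docs))
```

A coherent topic's words co-occur in many windows, yielding a high average PMI; incoherent word sets get scores near or below zero.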
Year | Venue | Keywords |
---|---|---|
2010 | North American Chapter of the Association for Computational Linguistics | strong result, topic coherence evaluation, lexical similarity, Google search engine, automatic evaluation, topic model, evaluation task, Wikipedia-based lexical relatedness method, Wikipedia data, distinct datasets, novel task
Field | DocType | ISBN
---|---|---|
Normalized Google distance, Lexical similarity, Computer science, Natural language processing, Artificial intelligence, WordNet, Interpretability, Search engine, Information retrieval, Coherence (physics), Topic model, Pointwise mutual information, Machine learning | Conference | 1-932432-65-5
Citations | PageRank | References
---|---|---|
203 | 7.42 | 27
Authors
---|
4
Name | Order | Citations | PageRank |
---|---|---|---|
David Newman | 1 | 1319 | 73.72 |
Jey Han Lau | 2 | 660 | 36.88 |
Karl Grieser | 3 | 295 | 11.68 |
Timothy Baldwin | 4 | 1767 | 116.85 |