Title
Topics in Tweets: A User Study of Topic Coherence Metrics for Twitter Data.
Abstract
Twitter offers scholars new ways to understand the dynamics of public opinion and social discussions. However, in order to understand such discussions, it is necessary to identify coherent topics that have been discussed in the tweets. To assess the coherence of topics, several automatic topic coherence metrics have been designed for classical document corpora. However, it is unclear how suitable these metrics are for topic models generated from Twitter datasets. In this paper, we use crowdsourcing to obtain pairwise user preferences of topic coherence and to determine how closely each of the metrics aligns with human preferences. Moreover, we propose two new automatic coherence metrics that use Twitter as a separate background dataset to measure the coherence of topics. We show that our proposed Pointwise Mutual Information-based metric provides the highest levels of agreement with human preferences of topic coherence over two Twitter datasets.
Year
2016
Venue
ECIR
Field
Data science, Semantic similarity, Data mining, Pairwise comparison, Latent Dirichlet allocation, Information retrieval, Crowdsourcing, Computer science, Coherence (physics), Topic model, Latent semantic analysis, Pointwise mutual information
DocType
Conference
Citations
7
PageRank
0.58
References
11
Authors
4
Name             Order  Citations  PageRank
Anjie Fang       1      35         5.93
Craig Macdonald  2      2588       178.50
Iadh Ounis       3      3438       234.59
Philip Habel     4      34         2.88