Title
Multi-objective Topic Modeling
Abstract
Topic Modeling (TM) is a rapidly growing area at the interface of text mining, artificial intelligence and statistical modeling that is increasingly deployed to address the 'information overload' associated with extensive text repositories. The goal in TM is typically to infer a rich yet intuitive summary model of a large document collection: a set of topics that characterizes the collection (each topic being a probability distribution over words), along with the degree to which each individual document is concerned with each topic. The model then supports segmentation, clustering, profiling, browsing, and many other tasks. Current approaches to TM, dominated by Latent Dirichlet Allocation (LDA), assume a topic-driven document generation process and find a model that maximizes the likelihood of the data with respect to this process. This is clearly sensitive to any mismatch between the 'true' generating process and the statistical model, and it is also clear that the quality of a topic model is multi-faceted and complex. Individual topics should be intuitively meaningful, sensibly distinct, and free of noise. Here we investigate multi-objective approaches to TM, which attempt to infer coherent topic models by navigating the trade-offs between objectives oriented towards coherence as well as coverage of the corpus at hand. Comparisons with LDA show that multi-objective evolutionary algorithm (MOEA) approaches yield significantly more coherent topics than LDA, consequently enhancing the use and interpretability of these models in a range of applications, without significant degradation in generalization ability.
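The abstract does not spell out its objective functions, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a coherence score based on pointwise mutual information (PMI, listed among the keywords) estimated from document co-occurrence, and a standard Pareto-dominance test of the kind a multi-objective search relies on. The names `pmi_coherence` and `dominates`, the smoothing constant `eps`, and the toy corpus are all hypothetical.

```python
import math
from itertools import combinations


def pmi_coherence(top_words, documents, eps=1e-12):
    """Average pairwise PMI of a topic's top words (assumed coherence proxy).

    Probabilities are estimated as the fraction of documents that contain
    the word(s). `eps` is an illustrative smoothing constant that avoids
    log(0). Expects at least two top words and a non-empty corpus.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def p(*words):
        # Fraction of documents containing all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    pairs = list(combinations(top_words, 2))
    total = 0.0
    for a, b in pairs:
        total += math.log((p(a, b) + eps) / (p(a) * p(b) + eps))
    return total / len(pairs)


def dominates(u, v):
    """True if objective vector u Pareto-dominates v (all objectives maximized)."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))


# Toy corpus (hypothetical): each document is a list of tokens.
docs = [
    ["cat", "dog", "pet"],
    ["cat", "dog", "vet"],
    ["stock", "market", "trade"],
    ["stock", "market", "pet"],
]

coherent = pmi_coherence(["cat", "dog"], docs)    # words that co-occur
mixed = pmi_coherence(["cat", "stock"], docs)     # words that never co-occur
```

In this toy corpus the topic whose words always appear together scores higher than the one whose words never do. A multi-objective search would keep candidate models whose objective vectors (for example, coherence and coverage) are not dominated by any other candidate.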
Year
2013
DOI
10.1007/978-3-642-37140-0_8
Venue
Lecture Notes in Computer Science
Keywords
Multi-objective optimization, Topic Modeling, Latent Dirichlet Allocation, MOEA/D, Pointwise Mutual Information, Perplexity
Field
Perplexity, Interpretability, Information overload, Latent Dirichlet allocation, Computer science, Artificial intelligence, Statistical model, Topic model, Cluster analysis, Pointwise mutual information, Machine learning
DocType
Conference
Volume
7811
ISSN
0302-9743
Citations
3
PageRank
0.42
References
12
Authors
4
Name | Order | Citations | PageRank
Osama Khalifa | 1 | 3 | 0.42
David W. Corne | 2 | 2161 | 152.00
M. J. Chantler | 3 | 148 | 24.14
Fraser Halley | 4 | 24 | 2.10