Title
Text segmentation via topic modeling: an analytical study
Abstract
In this paper, the task of text segmentation is approached from a topic modeling perspective. We investigate the use of the latent Dirichlet allocation (LDA) topic model to segment a text into semantically coherent segments. A major benefit of the proposed approach is that, along with the segment boundaries, it outputs the topic distribution associated with each segment. This information is of potential use in applications such as segment retrieval and discourse analysis. The new approach outperforms a standard baseline method and yields significantly better performance than most of the available unsupervised methods on a benchmark dataset.
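The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: this record does not describe their algorithm beyond naming LDA and listing dynamic programming as a keyword. The sketch assumes a hypothetical, hand-written topic-word table `PHI` in place of a trained LDA model, estimates each candidate segment's topic mixture with a few EM steps while holding `PHI` fixed, and uses dynamic programming to pick the boundaries that maximize the total log-likelihood minus an assumed per-segment `penalty`. The names `fit_theta`, `score_segment`, `segment`, `FLOOR`, and `penalty` are illustrative only.

```python
import math

# Hypothetical topic-word probabilities standing in for a trained LDA model.
PHI = [
    {"goal": 0.4, "team": 0.4, "vote": 0.1, "law": 0.1},  # topic 0
    {"goal": 0.1, "team": 0.1, "vote": 0.4, "law": 0.4},  # topic 1
]
FLOOR = 1e-6  # assumed probability for words missing from a topic


def fit_theta(tokens, phi, iters=20):
    """Estimate one segment's topic mixture by EM, keeping phi fixed."""
    k = len(phi)
    theta = [1.0 / k] * k
    for _ in range(iters):
        counts = [0.0] * k
        for w in tokens:
            post = [theta[t] * phi[t].get(w, FLOOR) for t in range(k)]
            z = sum(post)
            for t in range(k):
                counts[t] += post[t] / z
        theta = [c / len(tokens) for c in counts]
    return theta


def score_segment(tokens, phi):
    """Return (log-likelihood, topic mixture) for a candidate segment."""
    theta = fit_theta(tokens, phi)
    ll = sum(
        math.log(sum(theta[t] * phi[t].get(w, FLOOR) for t in range(len(phi))))
        for w in tokens
    )
    return ll, theta


def segment(sentences, phi, penalty=1.0):
    """Dynamic programming over sentence boundaries.

    best[end] is the best penalized log-likelihood of any segmentation
    of sentences[:end]; back[end] is where its last segment starts.
    """
    n = len(sentences)
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)
    for end in range(1, n + 1):
        for start in range(end):
            tokens = [w for s in sentences[start:end] for w in s]
            ll, _ = score_segment(tokens, phi)
            cand = best[start] + ll - penalty
            if cand > best[end]:
                best[end], back[end] = cand, start
    # Walk the back-pointers to recover (start, end, topic mixture) triples.
    segments, end = [], n
    while end > 0:
        start = back[end]
        tokens = [w for s in sentences[start:end] for w in s]
        segments.append((start, end, fit_theta(tokens, phi)))
        end = start
    return segments[::-1]


sentences = [["goal", "team"], ["team", "goal"], ["vote", "law"], ["law", "vote"]]
for start, end, theta in segment(sentences, PHI):
    print(start, end, [round(p, 2) for p in theta])
```

On this toy input the sketch returns two segments, sentences 0 to 1 and 2 to 3, each with a mixture dominated by a single topic. The penalty matters: without it, splitting further never lowers the likelihood, so the dynamic program has no reason to prefer fewer segments.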
Year
2009
DOI
10.1145/1645953.1646170
Venue
CIKM
Keywords
text segmentation, semantically coherent segment, analytical study, topic modeling perspective, topic model, potential use, segment retrieval, segment boundary, new approach, topic distribution, pattern recognition, discourse analysis, latent Dirichlet allocation, dynamic programming
Field
Dynamic topic model, Data mining, Dynamic programming, Latent Dirichlet allocation, Information retrieval, Computer science, Text segmentation, Discourse analysis, Artificial intelligence, Natural language processing, Topic model, Machine learning
DocType
Conference
Citations
40
PageRank
1.43
References
11
Authors
4
Name              Order   Citations   PageRank
Hemant Misra      1       175         14.91
François Yvon     2       941         102.51
Joemon M. Jose    3       2782        198.37
O. Cappe          4       2112        207.95