Title
Text segmentation with LDA-based Fisher kernel
Abstract
In this paper we propose a domain-independent text segmentation method consisting of three components. Latent Dirichlet allocation (LDA) is employed to compute the semantic distribution of words, semantic similarity is measured with the Fisher kernel, and the globally best segmentation is found by dynamic programming. Experiments on Chinese data sets show that the technique can be effective. Because it introduces latent semantic information, the algorithm is robust on irregular-sized segments.
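The dynamic-programming step named in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation. It assumes each sentence is already represented by a topic-proportion vector (for example, from a trained LDA model), uses cosine similarity as a simple stand-in for the Fisher kernel, and scores a segment by its size-normalized sum of pairwise similarities minus a per-segment `penalty`. The function names, the scoring function, and the `penalty` parameter are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity; a stand-in (assumption) for the Fisher kernel."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def segment(vectors, penalty=0.5):
    """Return the exclusive end index of each segment in the best split.

    vectors: one topic-proportion vector per sentence (assumed input).
    penalty: hypothetical cost charged per segment to avoid over-splitting.
    """
    n = len(vectors)
    sim = [[cosine(u, v) for v in vectors] for u in vectors]

    # 2-D prefix sums: pre[i][j] = sum of sim[a][b] for a < i and b < j.
    pre = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            pre[i + 1][j + 1] = (
                sim[i][j] + pre[i][j + 1] + pre[i + 1][j] - pre[i][j]
            )

    def cohesion(i, j):
        # Sum of all pairwise similarities inside [i, j), divided by length.
        total = pre[j][j] - pre[i][j] - pre[j][i] + pre[i][i]
        return total / (j - i)

    # best[j] = best total score for the first j sentences.
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cand = best[i] + cohesion(i, j) - penalty
            if cand > best[j]:
                best[j], back[j] = cand, i

    # Walk the back-pointers from the end to recover the boundaries.
    bounds = []
    j = n
    while j > 0:
        bounds.append(j)
        j = back[j]
    return sorted(bounds)
```

Because every possible segment `[i, j)` is scored, the search is global rather than greedy, and segments of very different lengths are compared on the same footing. For five sentences whose vectors are `[0.9, 0.1]`, `[0.8, 0.2]`, `[0.9, 0.1]`, `[0.1, 0.9]`, `[0.2, 0.8]`, the sketch places one boundary after the third sentence.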
Year
2008
Venue
ACL (Short Papers)
Keywords
lda-based fisher kernel,semantic similarity,global best segmentation,domain-independent text segmentation method,introducing latent semantic information,fisher kernel,words semantic distribution,irregular-sized segment,chinese data set,latent dirichlet allocation,dynamic programming,text segmentation
Field
Semantic similarity,Latent Dirichlet allocation,Scale-space segmentation,Pattern recognition,Computer science,Segmentation,Explicit semantic analysis,Text segmentation,Probabilistic latent semantic analysis,Artificial intelligence,Fisher kernel,Machine learning
DocType
Conference
Volume
P08-2
Citations
21
PageRank
1.16
References
7
Authors
4
Name, Order, Citations, PageRank
Qi Sun, 1, 21, 1.16
Runxin Li, 2, 33, 2.89
Dingsheng Luo, 3, 46, 11.61
Xihong Wu, 4, 279, 53.02