Title
Research on Mining Common Concern via Infinite Topic Modelling
Abstract
This paper focuses on mining the common concern shared by different textual data sources and analyzing each source's own eigen topics via infinite topic modelling. By incorporating non-parametric Bayesian approaches, our work achieves good performance and accords better with reality by avoiding restrictive assumptions. We propose two extensions of the Dirichlet process (DP), the bidirectional stick-breaking process and the multi-branches process, both based on the stick-breaking construction, to model multiple sequences of probability measures in one process rather than simply combining several DPs. Building on this new perspective of the DP, we discover the common topics and eigen topics via infinite topic modelling in a simple way, without setting the number of topics in advance. The experiments are carried out on three corpora of BBC news, about the UK, the US and China forum respectively. The results present the common concern of these three regions and their eigen interests in other aspects.
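For readers unfamiliar with the stick-breaking construction that the abstract builds on, the following is a minimal sketch of the standard, single-sequence version, truncated to a finite number of sticks. It is not the paper's bidirectional or multi-branches process. The function name `stick_breaking_weights`, the truncation level, and the concentration value are assumptions chosen for illustration.

```python
import random


def stick_breaking_weights(alpha, num_sticks, seed=None):
    """Truncated stick-breaking weights for a standard Dirichlet process.

    alpha: concentration parameter (> 0), an illustrative choice here.
    num_sticks: truncation level; the actual construction is infinite.
    Returns the list of weights and the unbroken remainder of the stick.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for _ in range(num_sticks):
        fraction = rng.betavariate(1.0, alpha)  # v_k ~ Beta(1, alpha)
        weights.append(remaining * fraction)    # pi_k = v_k * prod_{j<k} (1 - v_j)
        remaining *= 1.0 - fraction
    return weights, remaining


# Hypothetical usage: 20 topic weights with concentration 2.0.
weights, leftover = stick_breaking_weights(alpha=2.0, num_sticks=20, seed=0)
```

The weights plus the leftover stick length sum to one, so each weight can be read as the prior mass of one topic. A larger `alpha` spreads the mass over more topics, which is how the number of topics is left open rather than fixed in advance.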
Year
2012
DOI
10.1109/WI-IAT.2012.159
Venue
Web Intelligence/IAT Workshops
Keywords
common topic,eigen interest,probability measurement,extended process,multiple sequences,multi-branches process,us,bayes methods,common concern,dirichlet process,infinite topic,china,eigen topic,data analysis,multibranches process,hierarchical dirichlet process,dp,bidirectional stick-breaking process,infinite topic modelling,uk,data mining,textual data sources,mining common concern,common concern mining,nonparametric bayesian approaches,bbc news,probability,news
Field
Data mining,Hierarchical Dirichlet process,Dirichlet process,Information retrieval,Computer science,Probability measure,Artificial intelligence,Topic model,Machine learning,Bayesian probability
DocType
Conference
Volume
3
ISBN
978-1-4673-6057-9
Citations
0
PageRank
0.34
References
7
Authors
4
Name, Order, Citations, PageRank
Yishu Miao, 1, 178, 11.44
Chun-Ping Li, 2, 374, 59.17
Qiang Ding, 3, 11, 1.93
Li Li, 4, 0, 0.34