Title
HDP-TUB Based Topic Mining Method for Chinese Micro-blogs.
Abstract
Topic models are important tools for mining the latent topics of text. However, most existing topic models are derived from latent Dirichlet allocation (LDA), which requires the number of topics to be specified in advance. To mine the topics of Chinese micro-blogs automatically, we propose a nonparametric Bayesian model, named the HDP-TUB model, which is derived from the hierarchical Dirichlet process (HDP). In this model, we assume non-exchangeability of the data and use temporal information, user information and theme tags (TUB) to address the sparsity problem caused by short texts. To construct the HDP-TUB model, the Chinese Restaurant Franchise (CRF) method is extended to integrate the temporal information, user information and topic tag information. Experiments show that the HDP-TUB model outperforms the LDA model and the HDP model in terms of perplexity and the difference between topics.
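To illustrate why an HDP-style model does not need the number of topics fixed in advance, the following is a minimal sketch of the plain Chinese restaurant process, the seating scheme that the Chinese Restaurant Franchise builds on. It is not the HDP-TUB model: the function name `crp_assignments`, the concentration parameter `alpha`, and the seed are assumptions made here for illustration, and the time, user and tag extensions described in the abstract are not modeled.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Seat n_items customers with a Chinese restaurant process.

    Hypothetical helper for illustration only. alpha must be > 0.
    Returns (assignments, counts), where assignments[i] is the table
    index of customer i and counts[k] is the size of table k.
    """
    rng = random.Random(seed)
    counts = []       # counts[k] = number of customers at table k
    assignments = []  # table index chosen by each customer
    for _ in range(n_items):
        # An existing table k is chosen with weight counts[k];
        # a new table is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new table (a new cluster)
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments, counts


if __name__ == "__main__":
    assignments, counts = crp_assignments(200, alpha=1.0, seed=1)
    # The number of tables is an outcome of sampling, not an input.
    print(len(counts), "tables for", len(assignments), "customers")
```

The number of occupied tables grows with the data and with `alpha` rather than being set beforehand, which is the property the abstract contrasts with LDA.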
Year
2017
DOI
10.1007/978-3-319-73618-1_75
Venue
Lecture Notes in Artificial Intelligence
Keywords
Topic mining, HDP-TUB model, Hierarchical Dirichlet Process, Chinese Restaurant Franchise
DocType
Conference
Volume
10619
ISSN
0302-9743
Citations
0
PageRank
0.34
References
0
Authors
5
Name
Order
Citations
PageRank
Yaorong Zhang100.34
Bo Yang219453.08
Yi Li333.87
Yi Liu413154.73
Yangsen Zhang51112.10