Title
Two time-efficient Gibbs sampling inference algorithms for biterm topic model.
Abstract
Biterm Topic Model (BTM) is an effective topic model for short texts. However, its standard Gibbs sampling inference method (StdBTM) takes much more time than the standard Gibbs sampler (StdLDA) for Latent Dirichlet Allocation (LDA). To address this, we propose two time-efficient Gibbs sampling inference methods for BTM, SparseBTM and ESparseBTM, which trade additional space for reduced running time. SparseBTM reduces the computation in StdBTM by both recycling intermediate results and exploiting the sparsity of the word-topic count matrix. Theoretically, SparseBTM reduces the time complexity of StdBTM from $O(|B| K)$ to $O(|B| K_s)$, which scales linearly with the sparsity of the count matrix ($K_s$) instead of the number of topics ($K$); here $|B|$ is the number of biterms and $K_s < K$ is the average number of non-zero topics per word type in the count matrix. Experimental results show that, under favorable conditions, SparseBTM is approximately 18 times faster than StdBTM. ESparseBTM builds on SparseBTM and is more time-efficient still: it recycles more intermediate results by rearranging the biterm sequence. In theory, ESparseBTM reduces the time complexity of SparseBTM from $O(|B| K_s)$ to $O(R |B| K_s)$, where $0 < R < 1$ is the ratio of the number of biterm types to the number of biterms. Experimental results show that ESparseBTM improves on the time efficiency of SparseBTM by between 6.4% and 39.5%, depending on the dataset.
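The two quantities that drive the stated complexity bounds can be illustrated with a small Python sketch. This is not the authors' implementation: the function names, the toy corpus, and the toy count table are hypothetical, and a biterm is assumed to be an unordered pair of words co-occurring in one short text, as in the usual BTM formulation.

```python
from itertools import combinations


def extract_biterms(doc):
    """Return every unordered word pair (biterm) in one short text."""
    return [tuple(sorted(pair)) for pair in combinations(doc, 2)]


def biterm_type_ratio(docs):
    """Ratio of the number of distinct biterm types to the number of biterms."""
    biterms = [b for doc in docs for b in extract_biterms(doc)]
    return len(set(biterms)) / len(biterms)


def avg_nonzero_topics(word_topic_counts):
    """Average number of topics with a non-zero count per word type."""
    rows = list(word_topic_counts.values())
    return sum(sum(1 for c in row if c > 0) for row in rows) / len(rows)


# Hypothetical toy corpus of three short texts.
docs = [
    ["apple", "banana", "cherry"],
    ["apple", "banana"],
    ["banana", "cherry"],
]

# Hypothetical word-topic counts with 4 topics; most entries are zero.
word_topic_counts = {
    "apple": [3, 0, 0, 0],
    "banana": [2, 0, 1, 0],
    "cherry": [0, 0, 0, 2],
}

ratio = biterm_type_ratio(docs)                    # 3 types / 5 biterms
avg_nonzero = avg_nonzero_topics(word_topic_counts)  # (1 + 2 + 1) / 3
```

In this toy case the ratio is 0.6 and each word type has about 1.33 non-zero topics out of 4, so both quantities are well below their upper limits; the smaller they are, the larger the savings the abstract's bounds would suggest.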
Year: 2018
DOI: https://doi.org/10.1007/s10489-017-1004-2
Venue: Appl. Intell.
Keywords: Biterm topic model, Topic model, Latent Dirichlet allocation, Gibbs sampling
Field: Latent Dirichlet allocation, Matrix (mathematics), Inference, Computer science, Spacetime, Artificial intelligence, Topic model, Time complexity, Machine learning, Gibbs sampling, Computation
DocType: Journal
Volume: 48
Issue: 3
ISSN: 0924-669X
Citations: 2
PageRank: 0.36
References: 15
Authors: 3
Name            Order  Citations  PageRank
Xiaotang Zhou   1      19         4.08
Jihong OuYang   2      94         15.66
Ximing Li       3      44         13.97