Abstract | ||
---|---|---|
Latent Dirichlet Allocation (LDA), proposed by Blei, is a generative probabilistic model of a corpus in which documents are represented as random mixtures over latent topics and each topic is characterized by a distribution over words. It does not, however, model the positions of words within each document. This paper proposes a Word Position-Related LDA Model that takes the word positions of every document in the corpus into account by characterizing each word with a distribution over word positions. The precision of topic-word interpretability is improved by integrating the word-position distribution with an appropriate word degree, since the word degree differs across word positions. Finally, a new method, size-aware word intrusion, is proposed to strengthen the interpretation of topic words. Experimental results on the NIPS corpus show that the Word Position-Related LDA Model improves the precision of topic-word interpretability by about 9.67% on average. Comparisons across the experimental data also show that the size-aware word intrusion method interprets the semantic information of topic words more comprehensively and more effectively. |
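
For orientation, the precision figure quoted in the abstract can be read against the conventional word intrusion task, in which an annotator is shown a topic's top words plus one intruder and precision is the fraction of answers that pick the intruder. The sketch below illustrates that conventional measure only. The abstract does not define the size-aware variant, so it is not reproduced here, and the function name, the response format, and the sample data are all hypothetical.

```python
from collections import defaultdict


def model_precision(responses):
    """Per-topic word intrusion precision.

    `responses` is an iterable of (topic_id, chosen_word, true_intruder)
    tuples, one per annotator judgment. This format is an assumption made
    for illustration, not the paper's data layout.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for topic_id, chosen_word, true_intruder in responses:
        totals[topic_id] += 1
        # A judgment counts as correct when the annotator picked the intruder.
        hits[topic_id] += int(chosen_word == true_intruder)
    return {topic: hits[topic] / totals[topic] for topic in totals}


# Hypothetical judgments: two annotators for each of two topics.
responses = [
    (0, "banana", "banana"),
    (0, "kernel", "banana"),
    (1, "spike", "spike"),
    (1, "spike", "spike"),
]
precision = model_precision(responses)
print(precision)
```

A higher per-topic value means annotators found the intruder more easily, which is usually taken as a sign that the topic's top words are semantically coherent.
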
Year | DOI | Venue |
---|---|---|
2011 | 10.1142/S0218001411008890 | INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE |
Keywords | Field | DocType |
LDA, probabilistic topic models, word position, word degree, word intrusion | Latent Dirichlet allocation, Experimental data, Computer science, Artificial intelligence, Natural language processing, Interpretability, Intrusion, tf–idf, Pattern recognition, Word error rate, Statistical model, Generative grammar, Machine learning | Journal |
Volume | Issue | ISSN |
25 | 6 | 0218-0014 |
Citations | PageRank | References |
3 | 0.36 | 7 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Lidong Zhai | 1 | 23 | 5.97 |
Zhaoyun Ding | 2 | 29 | 5.90 |
Yan Jia | 3 | 56 | 10.52 |
Bin Zhou | 4 | 341 | 30.99 |