Title | ||
---|---|---|
Collaboratively Improving Topic Discovery and Word Embeddings by Coordinating Global and Local Contexts |
Abstract | ||
---|---|---|
A text corpus typically contains two types of context information -- global context and local context. Global context carries topical information which can be utilized by topic models to discover topic structures from the text corpus, while local context can train word embeddings to capture semantic regularities reflected in the text corpus. This encourages us to exploit the useful information in both the global and the local context information. In this paper, we propose a unified language model based on matrix factorization techniques which 1) takes the complementary global and local context information into consideration simultaneously, and 2) models topics and learns word embeddings collaboratively. We empirically show that by incorporating both global and local context, this collaborative model can not only significantly improve the performance of topic discovery over the baseline topic models, but also learn better word embeddings than the baseline word embedding models. We also provide qualitative analysis that explains how the cooperation of global and local context information can result in better topic structures and word embeddings.
|
Year | DOI | Venue |
---|---|---|
2017 | 10.1145/3097983.3098009 | KDD '17: The 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 2017 |
Keywords | Field | DocType |
Topic modeling, word embeddings, global context, local context, unified language model | Data mining, Computer science, Collaborative model, Context model, Natural language processing, Artificial intelligence, Word embedding, Language model, Matrix decomposition, Text corpus, Exploit, Topic model, Machine learning | Conference |
ISBN | Citations | PageRank |
978-1-4503-4887-4 | 10 | 0.53 |
References | Authors | |
16 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Guangxu Xun | 1 | 109 | 11.89 |
Yaliang Li | 2 | 629 | 50.87 |
Jing Gao | 3 | 2723 | 131.05 |
Aidong Zhang | 4 | 2970 | 405.63 |