Abstract | ||
---|---|---|
In this paper we present a topic-model-based approach for classifying micro-blog posts into a given set of topics of interest. The short length of micro-blog posts makes it challenging to learn a classification model from them directly. To overcome this limitation, we use the content of the links embedded in these posts to improve topic learning. The hypothesis is that, since the link content is far richer than the content of the post itself, using link content along with the content of the post will help learning. However, how this link content can be used to construct features for classification remains a challenging issue. Furthermore, in previous methods, user-based information is utilized in an ad-hoc manner that only works for certain types of classification, such as characterizing the content of microblogs. In this paper, we propose a supervised topic model, User-Labeled-LDA, and its nonparametric variant, which avoid the ad-hoc feature construction task and model the topics in a discriminative way. Our experiments on a Twitter dataset show that modeling user interests and link information helps in learning quality topics for sparse tweets and also helps significantly in the classification task. Our experiments further show that modeling this information in a principled way through topic models helps more than simply adding this information through features. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/ICDM.2015.148 | IEEE International Conference on Data Mining |
Keywords | Field | DocType |
microblog classification,supervised topic model,user-labeled-LDA,Twitter,tweets,topic classification,latent Dirichlet allocation | Data mining,Data modeling,Computer science,Artificial intelligence,Encyclopedia,Discriminative model,The Internet,Social media,Microblogging,Nonparametric statistics,Topic model,Machine learning,Electronic publishing | Conference |
ISSN | Citations | PageRank |
1550-4786 | 3 | 0.38 |
References | Authors | |
18 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Saurabh Kataria | 1 | 9 | 5.21 |
Arvind Agarwal | 2 | 93 | 10.11 |