Title
Supervised cross-collection topic modeling
Abstract
Vast amounts of multimedia data can now be obtained across different collections (or domains). Utilizing such cross-collection data poses significant challenges, for example, summarizing the similarities and differences of data across different domains (e.g., CNN and NYT) and finding visually similar images across different visual domains (e.g., photos, paintings, and hand-drawn sketches). In this paper, a supervised cross-collection Latent Dirichlet Allocation (scLDA) approach is proposed to utilize data across different collections. As a natural extension of traditional Latent Dirichlet Allocation (LDA), scLDA not only takes the structural priors of different collections into consideration, but also exploits category information. The strength of this work lies in integrating topic modeling, cross-domain learning, and supervised learning. We apply scLDA to comparative text mining as well as to the classification of news articles and images from different collections. The results suggest that the proposed scLDA can generate meaningful collection-specific topics and achieve better retrieval accuracy than other related topic models.
Year: 2012
DOI: 10.1145/2393347.2396356
Venue: ACM Multimedia 2001
Keywords: related topic model, cross-collection data, supervised cross-collection topic modeling, multimedia data, latent dirichlet allocation, meaningful collection-specific topic, different collection, cross-domain learning, different visual domain, proposed sclda, different domain, topic modeling
Field: Dynamic topic model, Automatic summarization, Latent Dirichlet allocation, Information retrieval, Computer science, Supervised learning, Exploit, Topic model, Prior probability
DocType: Conference
Citations: 8
PageRank: 0.56
References: 5
Authors: 6
Name              Order   Citations   PageRank
Haidong Gao       1       30          2.22
Siliang Tang      2       179         33.98
Yin Zhang         3       3492        281.04
Dapeng Jiang      4       11          2.39
Fei Wu            5       2209        153.88
Yue-Ting Zhuang   6       3549        216.06