Title
A partially supervised cross-collection topic model for cross-domain text classification
Abstract
Cross-domain text classification aims to automatically train a precise text classifier for a target domain by using labeled text data from a related source domain. To this end, one of the most promising ideas is to induce a new feature representation in which the distributional difference between domains is reduced, so that a more accurate classifier can be learned in this new feature space. However, most existing methods do not exploit the duality between the marginal distribution of examples and the conditional distribution of class labels given the labeled training examples in the source domain. Moreover, few previous works attempt to explicitly distinguish domain-independent from domain-specific latent features and to align the domain-specific features to further improve cross-domain learning. In this paper, we propose the Partially Supervised Cross-Collection LDA topic model (PSCCLDA) for cross-domain learning, which addresses these two issues in a unified way. Experimental results on nine datasets show that our model outperforms two standard classifiers and four state-of-the-art methods, demonstrating the effectiveness of the proposed model.
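As a rough illustration of the general pipeline the abstract describes (learn a shared latent topic representation over source and target documents, then train a classifier on source-domain topic features and apply it to the target domain), the sketch below uses scikit-learn's CountVectorizer, LatentDirichletAllocation, and LogisticRegression as generic stand-ins on hypothetical toy data. It is not the authors' PSCCLDA model and does not separate domain-independent from domain-specific topics; it only shows the shared-feature-space idea.

# Minimal sketch of a generic topic-based cross-domain pipeline (assumed
# example, not PSCCLDA): fit LDA on source + target documents so topics span
# both domains, then train on source topic features and predict the target.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: labeled source documents, unlabeled target documents.
source_docs = ["cheap flight deals", "hotel booking refund", "laptop battery died"]
source_labels = [0, 0, 1]          # e.g. 0 = travel, 1 = electronics
target_docs = ["phone screen cracked", "train ticket discount"]

# Shared bag-of-words vocabulary built over both domains.
vectorizer = CountVectorizer()
X_all = vectorizer.fit_transform(source_docs + target_docs)

# Fit LDA on the combined collection; topic proportions become the new features.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X_all)
Z_source = lda.transform(vectorizer.transform(source_docs))
Z_target = lda.transform(vectorizer.transform(target_docs))

# Train a classifier on source topic features and apply it to the target domain.
clf = LogisticRegression().fit(Z_source, source_labels)
print(clf.predict(Z_target))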
Year
2013
DOI
10.1145/2505515.2505556
Venue
CIKM
Keywords
target domain, cross-domain text classification, precise text classifier, related source domain, supervised cross-collection topic model, accurate classifier, topic model, cross-domain learning, source domain, labelled text data, topic modeling
Field
Data mining, Conditional probability distribution, Computer science, Duality (optimization), Artificial intelligence, Classifier (linguistics), Feature vector, Information retrieval, Pattern recognition, Topic model, Linear classifier, Marginal distribution, Machine learning
DocType
Conference
Citations
18
PageRank
0.62
References
18
Authors
3
Name | Order | Citations | PageRank
Yang Bao | 1 | 18 | 0.62
Nigel Collier | 2 | 1164 | 96.59
Anindya Datta | 3 | 842 | 127.21