A Random Walks Method for Text Classification - Citegraph

Paper Info

Title
A Random Walks Method for Text Classification

Abstract
A practical text classification system should be able to utilize information from both expensive labelled documents and large volumes of cheap unlabelled documents. It should also deal easily with newly input samples. In this paper, we propose a random walks method for text classification, in which the classification problem is formulated as solving for the absorption probabilities of Markov random walks on a weighted graph. The Laplacian operator for asymmetric graphs is then derived and applied to the asymmetric transition matrix. We also develop an induction algorithm for newly input documents based on the random walks method. Meanwhile, to make full use of text information, a difference measure for text data based on language models and KL-divergence is proposed, together with a new smoothing technique for it. Finally, an algorithm for eliminating ambiguous states is proposed to address the problem of noisy data. Experiments on two well-known data sets, WebKB and 20Newsgroup, demonstrate the effectiveness of the proposed random walks method.
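
The absorption-probability formulation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a small symmetric weight matrix, treats labelled documents as absorbing states, and uses the standard absorbing-chain solution B = (I - Q)^{-1} R. The function name `absorption_classify`, the `-1` convention for unlabelled nodes, and the toy graph are all hypothetical choices made for illustration; the paper's asymmetric Laplacian, KL-divergence measure, and smoothing are not reproduced here.

```python
import numpy as np

def absorption_classify(W, labels):
    """Label unlabelled nodes by absorption probabilities (illustrative sketch).

    W      : (n, n) non-negative edge weights (assumed: every row sum > 0).
    labels : length-n integer array, -1 marks an unlabelled node (assumed convention).
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)

    # Row-normalise the weights into a transition matrix P = D^{-1} W.
    P = W / W.sum(axis=1, keepdims=True)

    lab = np.flatnonzero(labels >= 0)   # absorbing (labelled) states
    unl = np.flatnonzero(labels < 0)    # transient (unlabelled) states

    Q = P[np.ix_(unl, unl)]             # transient -> transient block
    R = P[np.ix_(unl, lab)]             # transient -> absorbing block

    # B[i, j] = probability that a walk from unlabelled node i
    # is absorbed at labelled node j; solves (I - Q) B = R.
    B = np.linalg.solve(np.eye(len(unl)) - Q, R)

    # Sum absorption probabilities over labelled nodes of each class.
    classes = np.unique(labels[lab])
    scores = np.stack(
        [B[:, labels[lab] == c].sum(axis=1) for c in classes], axis=1
    )

    pred = labels.copy()
    pred[unl] = classes[scores.argmax(axis=1)]
    return pred, scores

# Toy path graph 0 - 1 - 2 - 3 with unit weights;
# node 0 is labelled class 0, node 3 is labelled class 1.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
labels = np.array([0, -1, -1, 1])
pred, scores = absorption_classify(W, labels)
print(pred)
print(scores)
```

In this toy graph each unlabelled node is absorbed with probability 2/3 by the nearer labelled node, so it takes that node's class.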

Year	DOI	Venue
2006	null	SIAM Proceedings Series
Keywords	Field	DocType
null	Random graph,Pattern recognition,Random walk,Computer science,Artificial intelligence,Machine learning	Conference
Volume	Issue	Citations
2006	null	10
PageRank	References	Authors
1.03	12	3

Authors (3 rows)

Name	Order	Citations	PageRank
Yunpeng Xu	1	21	4.44
Xing Yi	2	293	20.64
Changshui Zhang	3	5506	323.40