Title
A Random Walks Method for Text Classification
Abstract
A practical text classification system should be able to use information from both expensive labelled documents and large volumes of cheap unlabelled documents. It should also handle newly input samples easily. In this paper, we propose a random walks method for text classification, in which the classification problem is formulated as solving for the absorption probabilities of Markov random walks on a weighted graph. The Laplacian operator for asymmetric graphs is then derived and applied to the asymmetric transition matrix. We also develop an induction algorithm for newly input documents based on the random walks method. To make full use of the text information, we propose a difference measure for text data based on language models and KL-divergence, together with a new smoothing technique for it. Finally, an algorithm for eliminating ambiguous states is proposed to address the problem of noisy data. Experiments on two well-known data sets, WebKB and 20Newsgroup, demonstrate the effectiveness of the proposed random walks method.
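The formulation in the abstract, classification as absorption probabilities of a Markov random walk on a weighted graph, can be illustrated with a minimal sketch. This is the textbook absorbing-chain computation and not necessarily the paper's exact algorithm (which works through a Laplacian for asymmetric graphs). The function name `absorption_probabilities`, the convention of marking unlabelled documents with `-1`, and the toy weight matrix are all assumptions made for illustration. The sketch also assumes every node has at least one positive weight and every unlabelled node can reach a labelled one.

```python
import numpy as np


def absorption_probabilities(W, labels):
    """Hypothetical helper: class-absorption probabilities on a weighted graph.

    W      : (n, n) non-negative weight matrix (may be asymmetric).
    labels : length-n integer array, class id for labelled nodes, -1 otherwise.
    Returns (classes, F) where F[i, c] is the probability that a walk started
    at node i is first absorbed at a labelled node of class classes[c].
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels[labels >= 0])
    lab = np.flatnonzero(labels >= 0)   # absorbing (labelled) states
    unl = np.flatnonzero(labels < 0)    # transient (unlabelled) states

    # Row-normalise the weights into a transition matrix.
    P = W / W.sum(axis=1, keepdims=True)

    # One-hot class indicator for each labelled state.
    Y = (labels[lab][:, None] == classes[None, :]).astype(float)

    # Absorption probabilities of the transient states solve
    # (I - P_uu) F_u = P_ul Y.
    P_uu = P[np.ix_(unl, unl)]
    P_ul = P[np.ix_(unl, lab)]
    F_u = np.linalg.solve(np.eye(len(unl)) - P_uu, P_ul @ Y)

    F = np.zeros((len(labels), len(classes)))
    F[lab] = Y
    F[unl] = F_u
    return classes, F


# Toy example (assumed data): a 4-node chain 0 - 1 - 2 - 3 with unit weights,
# where node 0 is labelled class 0 and node 3 is labelled class 1.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
labels = np.array([0, -1, -1, 1])
classes, F = absorption_probabilities(W, labels)
predicted = classes[F.argmax(axis=1)]
```

On this chain, node 1 is absorbed at class 0 with probability 2/3 and node 2 with probability 1/3, matching the classical gambler's-ruin result, so each unlabelled node is assigned the class of its nearer labelled neighbour.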
Year
2006
DOI
null
Venue
SIAM Proceedings Series
Keywords
null
Field
Random graph, Pattern recognition, Random walk, Computer science, Artificial intelligence, Machine learning
DocType
Conference
Volume
2006
Issue
null
Citations
10
PageRank
1.03
References
12
Authors
3
Name
Order
Citations
PageRank
Yunpeng Xu1214.44
Xing Yi229320.64
Changshui Zhang35506323.40