Title
Utilizing Different Link Types to Enhance Document Clustering Based on Markov Random Field Model With Relaxation Labeling
Abstract
As a fast-growing number of works use link information to enhance unsupervised document clustering, a comparative evaluation of the impact of different link types on document clustering is becoming necessary. Various types of links between text documents, including explicit links such as citation links and hyperlinks, implicit links such as coauthorship and cocitation links, and similarity links such as content similarity links, convey topic similarity or topic-transferring patterns, which are very useful for document clustering. In this paper, we adopt a clustering algorithm based on a Markov random field and relaxation labeling, which employs both content and linkage information, to evaluate the effectiveness of these link types for document clustering on ten data sets. The experimental results show that linkage information is quite effective in improving content-based document clustering. Furthermore, our experiments yield a series of important findings regarding the impact of different link types on document clustering.
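The abstract names relaxation labeling over linked documents but does not spell out the update rule, so the following is only a minimal sketch of the classical relaxation-labeling update and not the authors' algorithm. It assumes that content information enters as initial per-document cluster probabilities, that links are weighted and symmetric, and that linked documents support the same cluster (an identity compatibility function). The names `relaxation_labeling`, `hard_labels`, `content_probs`, and `links` are hypothetical.

```python
def relaxation_labeling(content_probs, links, n_iter=10):
    """Refine content-based cluster probabilities using link structure.

    content_probs: list of per-document probability lists (one entry per cluster),
                   assumed to come from a content-only clustering step.
    links: dict mapping a document index to a list of (neighbor index, weight).
    """
    probs = [row[:] for row in content_probs]
    n_clusters = len(probs[0])
    for _ in range(n_iter):
        updated = []
        for i, row in enumerate(probs):
            neighbors = links.get(i, [])
            total_w = sum(w for _, w in neighbors)
            # Support for cluster k: weighted average of the neighbors'
            # current belief in k (zero when the document has no links).
            support = [
                sum(w * probs[j][k] for j, w in neighbors) / total_w
                if total_w > 0 else 0.0
                for k in range(n_clusters)
            ]
            # Classical update: scale each probability by (1 + support),
            # then renormalize so the row sums to one.
            raw = [row[k] * (1.0 + support[k]) for k in range(n_clusters)]
            z = sum(raw)
            updated.append([v / z for v in raw])
        probs = updated  # synchronous update of all documents
    return probs


def hard_labels(probs):
    """Assign each document to its most probable cluster."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Toy data (invented for illustration): document 2 is ambiguous by content
# but is linked to documents 0 and 1; document 3 has no links.
content_probs = [[0.9, 0.1], [0.8, 0.2], [0.45, 0.55], [0.1, 0.9]]
links = {0: [(2, 1.0)], 1: [(2, 1.0)], 2: [(0, 1.0), (1, 1.0)]}

print(hard_labels(content_probs))                              # [0, 0, 1, 1]
print(hard_labels(relaxation_labeling(content_probs, links)))  # [0, 0, 0, 1]
```

In this toy case the links pull the ambiguous document into the cluster of its neighbors, while the unlinked document keeps its content-based assignment. Different link types (citation, coauthorship, content similarity) would simply supply different `links` dictionaries under this sketch.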
Year: 2012
DOI: 10.1109/TSMCA.2012.2187183
Venue: IEEE Transactions on Systems, Man, and Cybernetics, Part A
Keywords: cocitation links, Markov random field model, pattern clustering, link type, clustering algorithm, random processes, topic similarity, relaxation labeling, Markov processes, text documents, content-based document clustering, hyperlinks, Markov random field (MRF), document clustering enhancement, linkage information, link-based document clustering, relaxation labeling (RL), unsupervised document clustering, coauthorship links, content information, implicit links, content similarity links, text analysis, citation analysis, link information, topic transferring pattern
Field: Data mining, Fuzzy clustering, Data stream clustering, Correlation clustering, Document clustering, Computer science, Consensus clustering, Artificial intelligence, Conceptual clustering, Cluster analysis, Brown clustering, Machine learning
DocType: Journal
Volume: 42
Issue: 5
ISSN: 1083-4427
Citations: 1
PageRank: 0.35
References: 23
Authors: 5
Name | Order | Citations | PageRank
Xiaodan Zhang | 1 | 1 | 0.35
Xiaohua Hu | 2 | 2819 | 314.15
Tingting Hu | 3 | 63 | 11.93
E. K. Park | 4 | 1 | 0.35
Xiaohua Zhou | 5 | 438 | 25.82