Title
Possibilistic fuzzy co-clustering of large document collections
Abstract
In this paper we propose a new co-clustering algorithm, possibilistic fuzzy co-clustering (PFCC), for the automatic categorization of large document collections. PFCC integrates a possibilistic document clustering technique and a combined formulation of fuzzy word ranking and partitioning into a fast iterative co-clustering procedure. This framework offers several benefits at once: robustness to document and word outliers, rich representations of co-clusters, highly descriptive document clusters, good performance in high-dimensional spaces, and reduced sensitivity to initialization in possibilistic clustering. We present the detailed formulation of PFCC and explain the motivations behind it. We also discuss its advantages over existing approaches and provide a proof of the algorithm's convergence. Experiments on several large document data sets demonstrate the effectiveness of PFCC.
Year
2007
DOI
10.1016/j.patcog.2007.04.017
Venue
Pattern Recognition
Keywords
possibilistic fuzzy co-clustering, large document collection, descriptive document cluster, possibilistic document, combined formulation, possibilistic clustering, fuzzy word ranking, detailed formulation, new co-clustering algorithm, large document data set, fuzzy clustering, co clustering, text mining, document clustering, information retrieval
Field
Fuzzy clustering, Data mining, Ranking, Document clustering, Fuzzy logic, Possibility theory, Artificial intelligence, Initialization, Biclustering, Cluster analysis, Machine learning, Mathematics
DocType
Journal
Volume
40
Issue
12
Journal
Pattern Recognition
Citations
14
PageRank
0.68
References
23
Authors (2)
William-Chandra Tjhi: order 1, citations 156, PageRank 10.09
Lihui Chen: order 2, citations 380, PageRank 27.30