Title
A new fuzzy co-clustering algorithm for categorization of datasets with overlapping clusters
Abstract
Fuzzy co-clustering is a method that performs simultaneous fuzzy clustering of objects and features. In this paper, we introduce a new fuzzy co-clustering algorithm for high-dimensional datasets called Cosine-Distance-based & Dual-partitioning Fuzzy Co-clustering (CODIALING FCC). Unlike many existing fuzzy co-clustering algorithms, CODIALING FCC is a dual-partitioning algorithm: it clusters the features in the same manner as it clusters the objects, that is, by partitioning them according to their natural groupings. It is also a cosine-distance-based algorithm because it uses the cosine distance to capture the degree to which objects and features belong to the co-clusters. Our main purpose in introducing this new algorithm is to improve on the performance of some prominent existing fuzzy co-clustering algorithms when dealing with datasets whose clusters overlap heavily. In our opinion, this is crucial because most real-world datasets involve a significant amount of overlap in their inherent clustering structures. We discuss how this improvement can be achieved through the dual-partitioning formulation adopted. Experimental results on a toy problem and five large benchmark document datasets demonstrate that CODIALING FCC handles overlaps more effectively.
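The two ideas named in the abstract, cosine distance and partitioning features in the same way as objects, can be illustrated with a minimal sketch. This is not the CODIALING FCC update rule, which the abstract does not give. The helper `fuzzy_memberships`, its `temperature` parameter, the exponential weighting, the toy matrix, and the fixed prototypes are all assumptions made for illustration only.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(a, b): 0 for the same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # convention chosen here for all-zero vectors
    return 1.0 - dot / (norm_a * norm_b)


def fuzzy_memberships(vector, prototypes, temperature=0.1):
    """Hypothetical helper: map cosine distances to memberships that sum to 1.

    The exponential weighting is an illustrative assumption, not the
    membership formula of the paper.
    """
    weights = [math.exp(-cosine_distance(vector, p) / temperature)
               for p in prototypes]
    total = sum(weights)
    return [w / total for w in weights]


# Toy document-term matrix: rows are objects (documents), columns are features (terms).
data = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 2, 3],
    [1, 1, 1, 1],  # an object that overlaps both groups
]

# Assumed, hand-picked cluster prototypes in feature space (one per cluster).
object_prototypes = [[1, 1, 0, 0], [0, 0, 1, 1]]
object_memberships = [fuzzy_memberships(row, object_prototypes) for row in data]

# Dual partitioning, as sketched here: each feature (column) is also given
# memberships that sum to 1 across clusters, just like each object.
columns = [list(col) for col in zip(*data)]
# Assumed prototypes in object space (one value per document).
feature_prototypes = [[1, 1, 0, 0.5], [0, 0, 1, 0.5]]
feature_memberships = [fuzzy_memberships(col, feature_prototypes) for col in columns]

for label, rows in (("objects", object_memberships), ("features", feature_memberships)):
    for i, row in enumerate(rows):
        print(label, i, [round(u, 3) for u in row])
```

In this sketch the fourth document is equally close to both prototypes, so its membership is split evenly between the two clusters, which is the kind of overlap a fuzzy partition is meant to express.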
Year: 2006
DOI: 10.1007/11811305_36
Venue: ADMA
Keywords: cosine-distance-based algorithm, codialing fcc, large benchmark document datasets, high-dimensional datasets, existing fuzzy co-clustering algorithm, overlapping cluster, new fuzzy co-clustering algorithm, simultaneous fuzzy clustering, new algorithm, dual-partitioning algorithm, fuzzy co-clustering, fuzzy clustering
Field: Cluster (physics), Fuzzy clustering, Data mining, Computer science, Fuzzy set operations, Artificial intelligence, Biclustering, Cluster analysis, Categorization, Toy problem, Fuzzy logic, Algorithm, Machine learning
DocType: Conference
Volume: 4093
ISSN: 0302-9743
ISBN: 3-540-37025-0
Citations: 1
PageRank: 0.36
References: 11
Authors: 2
Name: William-Chandra Tjhi; Order: 1; Citations: 156; PageRank: 10.09
Name: Lihui Chen; Order: 2; Citations: 380; PageRank: 27.30