Title | ||
---|---|---|
A new fuzzy co-clustering algorithm for categorization of datasets with overlapping clusters |
Abstract | ||
---|---|---|
Fuzzy co-clustering is a method that performs simultaneous fuzzy clustering of objects and features. In this paper, we introduce a new fuzzy co-clustering algorithm for high-dimensional datasets called Cosine-Distance-based & Dual-partitioning Fuzzy Co-clustering (CODIALING FCC). Unlike many existing fuzzy co-clustering algorithms, CODIALING FCC is a dual-partitioning algorithm: it clusters the features in the same manner as it clusters the objects, that is, by partitioning them according to their natural groupings. It is also a cosine-distance-based algorithm because it uses the cosine distance to capture the belongingness of objects and features in the co-clusters. Our main purpose in introducing this new algorithm is to improve on the performance of some prominent existing fuzzy co-clustering algorithms when dealing with datasets whose clusters overlap heavily. In our opinion, this is crucial, since most real-world datasets involve a significant amount of overlap in their inherent clustering structures. We discuss how this improvement can be made through the dual-partitioning formulation adopted. Experimental results on a toy problem and five large benchmark document datasets demonstrate the effectiveness of CODIALING FCC in handling overlaps. |
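
The abstract names two ingredients: cosine distance as the measure of belongingness, and partitioning features in the same manner as objects. The sketch below only illustrates those two ideas and is not the paper's algorithm, whose update equations do not appear in this record. It assumes a generic fuzzy c-means style normalization of inverse distances; the function names, the fuzzifier `m`, the toy matrix, and the prototype vectors are all hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v): 0 for parallel vectors, 1 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: an all-zero vector is treated as orthogonal
    return 1.0 - dot / (norm_u * norm_v)


def fuzzy_memberships(x, prototypes, m=2.0, eps=1e-12):
    """Memberships of x in each cluster; non-negative and summing to 1.

    Generic fuzzy c-means style rule (an assumption, not the paper's):
    weight each cluster by distance ** (-1 / (m - 1)), then normalize.
    """
    dists = [max(cosine_distance(x, p), eps) for p in prototypes]
    weights = [d ** (-1.0 / (m - 1.0)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical toy document-term matrix: rows are objects, columns are features.
data = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [1, 1, 1, 1],  # an object that overlaps both groups
]

# Hypothetical cluster prototypes, one set per partition.
object_prototypes = [[1, 1, 0, 0], [0, 0, 1, 1]]   # vectors in feature space
feature_prototypes = [[1, 1, 0, 0], [0, 0, 1, 0]]  # vectors in object space

# Dual partitioning, as sketched here: apply the same membership rule to the
# rows (objects) and to the columns (features) of the matrix.
object_memberships = [fuzzy_memberships(row, object_prototypes) for row in data]
columns = [list(col) for col in zip(*data)]
feature_memberships = [fuzzy_memberships(col, feature_prototypes) for col in columns]

for i, u in enumerate(object_memberships):
    print(f"object {i}: {[round(value, 3) for value in u]}")
for j, v in enumerate(feature_memberships):
    print(f"feature {j}: {[round(value, 3) for value in v]}")
```
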
Year | DOI | Venue |
---|---|---|
2006 | 10.1007/11811305_36 | ADMA |
Keywords | Field | DocType |
cosine-distance-based algorithm,codialing fcc,large benchmark document datasets,high-dimensional datasets,existing fuzzy co-clustering algorithm,overlapping cluster,new fuzzy co-clustering algorithm,simultaneous fuzzy clustering,new algorithm,dual-partitioning algorithm,fuzzy co-clustering,fuzzy clustering | Cluster (physics),Fuzzy clustering,Data mining,Computer science,Fuzzy set operations,Artificial intelligence,Biclustering,Cluster analysis,Categorization,Toy problem,Fuzzy logic,Algorithm,Machine learning | Conference |
Volume | ISSN | ISBN |
4093 | 0302-9743 | 3-540-37025-0 |
Citations | PageRank | References |
1 | 0.36 | 11 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
William-Chandra Tjhi | 1 | 156 | 10.09 |
Lihui Chen | 2 | 380 | 27.30 |