Abstract | ||
---|---|---|
Document clustering is a popular research topic that aims to partition a corpus into subgroups of homogeneous documents. Traditional clustering approaches generally fail to consider how word weights relate to clusters. To address this problem, we propose an Adaptive Centroid-based Clustering (ACC) algorithm. The Class-Feature-Centroid (CFC) algorithm, a successful supervised centroid-based classifier, takes relationships among words into account. ACC attempts to use this discriminative CFC vector to drive the clustering procedure. Since clustering is unsupervised, ACC begins with hundreds of small clusters to obtain acceptable CFC vectors, and then iteratively regroups clusters of documents until convergence. Because ACC is self-organized, it can determine the number of clusters adaptively. The experimental results show that ACC achieves performance competitive with state-of-the-art clustering approaches. |
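
The procedure outlined in the abstract (start from many small clusters, build a CFC-style vector per cluster, regroup documents, repeat until nothing changes) can be illustrated with a minimal sketch. This is not the authors' implementation: the round-robin initialization, the cosine similarity, the function names (`cfc_centroids`, `similarity`, `acc_sketch`), and the constant `b = e - 1.7` are all assumptions made for illustration. The weight form `b^(DF/|C_j|) * log(|C|/CF)` is the one commonly associated with CFC and is likewise assumed here rather than taken from this record.

```python
import math
from collections import Counter


def cfc_centroids(docs, labels, b=math.e - 1.7):
    """Build one CFC-style weight vector per non-empty cluster (assumed form)."""
    members = {}
    for doc, lab in zip(docs, labels):
        members.setdefault(lab, []).append(doc)

    df = {}          # df[lab][t]: documents in cluster `lab` containing term t
    cf = Counter()   # cf[t]: number of clusters containing term t
    for lab, group in members.items():
        df[lab] = Counter(t for doc in group for t in set(doc))
        cf.update(df[lab].keys())

    n_clusters = len(members)
    centroids = {}
    for lab, group in members.items():
        centroids[lab] = {
            # inner-cluster term b^(DF/|C_j|) times inter-cluster term log(|C|/CF)
            t: (b ** (n / len(group))) * math.log(n_clusters / cf[t])
            for t, n in df[lab].items()
        }
    return centroids


def similarity(doc, centroid):
    """Cosine similarity between a document's term counts and a centroid."""
    tf = Counter(doc)
    dot = sum(f * centroid.get(t, 0.0) for t, f in tf.items())
    norm_d = math.sqrt(sum(f * f for f in tf.values()))
    norm_c = math.sqrt(sum(w * w for w in centroid.values()))
    return dot / (norm_d * norm_c) if norm_d and norm_c else 0.0


def acc_sketch(docs, n_initial, max_iter=50):
    """Regroup documents until assignments stop changing (or max_iter is hit)."""
    # Assumption: round-robin initialization into n_initial small clusters.
    labels = [i % n_initial for i in range(len(docs))]
    for _ in range(max_iter):
        centroids = cfc_centroids(docs, labels)
        new_labels = []
        for doc, old in zip(docs, labels):
            # A document moves only if another cluster is strictly more similar.
            best, best_sim = old, similarity(doc, centroids[old])
            for lab in sorted(centroids):
                s = similarity(doc, centroids[lab])
                if s > best_sim:
                    best, best_sim = lab, s
            new_labels.append(best)
        if new_labels == labels:
            break
        labels = new_labels
    # Clusters that lost all documents simply no longer appear in `labels`.
    return labels
```

Each document is a list of tokens, for example `acc_sketch([["cat", "dog"], ["stock", "bond"]], n_initial=2)`. In this sketch the final number of clusters is the number of distinct labels that survive regrouping, which is one simple way to mimic the adaptive behaviour the abstract describes.
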
Year | DOI | Venue |
---|---|---|
2014 | 10.1109/PAAP.2014.13 | PAAP |
Keywords | Field | DocType |
pattern clustering,class-feature-centroid,acc algorithm,pattern classification,cfc vector,homogeneous documents,cfc algorithm,class-feature-centroid algorithm,corpus partition,document clustering,supervised centroid-based classifier,document handling,vectors,adaptively,adaptive centroid-based clustering algorithm,text document data,algorithm design and analysis,measurement,clustering algorithms,entropy,frequency modulation | k-medians clustering,Canopy clustering algorithm,Data mining,Fuzzy clustering,CURE data clustering algorithm,Data stream clustering,Correlation clustering,Pattern recognition,Computer science,Artificial intelligence,Cluster analysis,Single-linkage clustering | Conference |
ISSN | Citations | PageRank |
2168-3034 | 1 | 0.36 |
References | Authors | |
4 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ximing Li | 1 | 44 | 13.97 |
Jihong OuYang | 2 | 94 | 15.66 |
Xiaotang Zhou | 3 | 19 | 4.08 |
Bo Fu | 4 | 5 | 2.79 |