Title
Adaptive Centroid-Based Clustering Algorithm for Text Document Data
Abstract
Document clustering is a popular research topic that aims to partition a corpus into subgroups of homogeneous documents. Traditional clustering approaches generally do not consider how words are weighted with respect to clusters. To address this problem, we propose an Adaptive Centroid-based Clustering (ACC) algorithm. The Class-Feature-Centroid (CFC) algorithm, a successful supervised centroid-based classifier, takes relationships among words into account. ACC employs this discriminative CFC vector to drive the clustering procedure. Because clustering is unsupervised, ACC begins with hundreds of small clusters to obtain acceptable CFC vectors, and then iteratively regroups the clusters of documents until convergence. Because ACC is self-organized, it can determine the number of clusters adaptively. The experimental results show that ACC achieves performance competitive with state-of-the-art clustering approaches.
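The loop described in the abstract (start from many small clusters, build a CFC vector per cluster, regroup documents, repeat until nothing changes) can be sketched as below. This is a minimal illustration and not the authors' implementation. The abstract does not give the weighting formula, the similarity measure, or the initialization, so all of these are assumptions: the weight is taken to be the commonly cited CFC form b^(DF/|C_j|) * log(|C|/CF), similarity is plain cosine, and the initial clusters are a random split. The names `cfc_centroids`, `similarity`, `adaptive_cluster`, and `n_init` are hypothetical.

```python
import math
import random
from collections import Counter


def cfc_centroids(docs, labels, b=math.e - 1.7):
    """Build one unit-length CFC-style vector per non-empty cluster.

    Assumed weighting (not stated in the abstract):
        w[t, j] = b ** (DF_j(t) / |C_j|) * log(|C| / CF(t))
    """
    members = {}
    for doc, lab in zip(docs, labels):
        members.setdefault(lab, []).append(set(doc))
    n_clusters = len(members)

    # CF(t): number of clusters in which term t occurs at least once.
    cf = Counter()
    for group in members.values():
        cf.update(set().union(*group))

    centroids = {}
    for lab, group in members.items():
        # DF_j(t): number of documents in this cluster that contain t.
        df = Counter(t for d in group for t in d)
        vec = {t: b ** (df[t] / len(group)) * math.log(n_clusters / cf[t])
               for t in df}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        centroids[lab] = {t: w / norm for t, w in vec.items()}
    return centroids


def similarity(doc, centroid):
    """Cosine similarity between a term-frequency document and a unit centroid."""
    tf = Counter(doc)
    norm = math.sqrt(sum(c * c for c in tf.values())) or 1.0
    return sum(c * centroid.get(t, 0.0) for t, c in tf.items()) / norm


def adaptive_cluster(docs, n_init, max_iter=50, seed=0):
    """Regroup documents around CFC-style centroids until labels stop changing."""
    rng = random.Random(seed)
    # Assumed initialization: a random, roughly even split into n_init clusters.
    labels = [i % n_init for i in range(len(docs))]
    rng.shuffle(labels)
    for _ in range(max_iter):
        centroids = cfc_centroids(docs, labels)
        new_labels = [
            max(centroids, key=lambda j: similarity(d, centroids[j]))
            for d in docs
        ]
        if new_labels == labels:  # converged
            break
        labels = new_labels
    return labels


docs = [
    ["apple", "banana"], ["apple", "fruit"], ["banana", "fruit"],
    ["car", "engine"], ["car", "wheel"], ["engine", "wheel"],
]
labels = adaptive_cluster(docs, n_init=3)
print(labels, "clusters remaining:", len(set(labels)))
```

In this sketch a cluster that loses all of its documents is simply never rebuilt, so the number of surviving clusters can end up below `n_init`. That is one simple way to let the cluster count adapt; the mechanism ACC actually uses may differ.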
Year
2014
DOI
10.1109/PAAP.2014.13
Venue
PAAP
Keywords
pattern clustering, class-feature-centroid, acc algorithm, pattern classification, cfc vector, homogeneous documents, cfc algorithm, class-feature-centroid algorithm, corpus partition, document clustering, supervised centroid-based classifier, document handling, vectors, adaptively, adaptive centroid-based clustering algorithm, text document data, algorithm design and analysis, measurement, clustering algorithms, entropy, frequency modulation
Field
k-medians clustering, Canopy clustering algorithm, Data mining, Fuzzy clustering, CURE data clustering algorithm, Data stream clustering, Correlation clustering, Pattern recognition, Computer science, Artificial intelligence, Cluster analysis, Single-linkage clustering
DocType
Conference
ISSN
2168-3034
Citations
1
PageRank
0.36
References
4
Authors
4
Name             Order    Citations    PageRank
Ximing Li        1        44           13.97
Jihong OuYang    2        94           15.66
Xiaotang Zhou    3        19           4.08
Bo Fu            4        5            2.79