Title
Ensemble Methods in the Clustering of String Patterns
Abstract
We address the problem of clustering contour images from hardware tools, based on string descriptions, in a comparative study of cluster combination techniques. Several clustering algorithms are considered, covering both hierarchical agglomerative and partitional approaches. In the latter class of algorithms, we explore: an adaptation of the K-means algorithm to string patterns using the median string as cluster representative; the error-correcting parsing approach by Fu; and the very recent spectral clustering approach. These algorithms are applied using several dissimilarity measures, namely: minimum code length based measures; dissimilarity based on the concept of reduction in grammatical complexity; and error-correcting parsing. First, the clustering algorithms are applied individually to the image data set, and results are evaluated in terms of the error rate, taking the known labeling of the data as ground truth. In a second step, we combine multiple data partitions, which we call a clustering ensemble, using three state-of-the-art clustering combination techniques. Results show that combination methods generally lead to better data partitions, as measured against the ground truth.
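To make the K-means adaptation mentioned in the abstract more concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes plain Levenshtein distance as the string dissimilarity (the abstract lists other measures) and uses the set median, i.e. the cluster member with the smallest total distance to the other members, as a simplified stand-in for the median string. The names `edit_distance`, `set_median`, and `string_kmeans` are hypothetical.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute or match
        prev = curr
    return prev[-1]


def set_median(cluster: Sequence[str]) -> str:
    """Cluster member minimizing the summed distance to all members.

    Assumption: restricts the median to existing members (set median),
    which is a simplification of a general median string.
    """
    return min(cluster, key=lambda s: sum(edit_distance(s, t) for t in cluster))


def string_kmeans(strings: Sequence[str],
                  init: Sequence[str],
                  max_iter: int = 20) -> Tuple[List[int], List[str]]:
    """K-means-style loop: assign to nearest representative, then update it."""
    medians = list(init)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: nearest representative under edit distance.
        new_labels = [
            min(range(len(medians)), key=lambda k: edit_distance(s, medians[k]))
            for s in strings
        ]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
        # Update step: recompute the representative of each non-empty cluster.
        for k in range(len(medians)):
            members = [s for s, lab in zip(strings, labels) if lab == k]
            if members:
                medians[k] = set_median(members)
    return labels, medians


if __name__ == "__main__":
    # Toy strings standing in for contour string descriptions (illustrative only).
    data = ["aaaa", "aaab", "aaba", "cccc", "cccd", "ccdc"]
    print(string_kmeans(data, init=["aaaa", "cccc"]))
```

The initial representatives are passed in explicitly so the sketch stays deterministic; a real experiment would typically try several initializations and could swap `edit_distance` for any of the dissimilarity measures named in the abstract.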
Year
2005
DOI
10.1109/ACVMOT.2005.46
Venue
WACV/MOTION
Keywords
better data, string patterns, state-of-the-art clustering combination technique, image data, string description, cluster combination technique, ensemble methods, median string, combination method, clustering algorithm, multiple data partition, clustering ensemble, k means algorithm, image processing, error correction, ground truth, error rate
Field
Data mining, Canopy clustering algorithm, Fuzzy clustering, CURE data clustering algorithm, Data stream clustering, Correlation clustering, Pattern recognition, Computer science, Artificial intelligence, Cluster analysis, Brown clustering, Single-linkage clustering
DocType
Conference
ISBN
0-7695-2271-8-1
Citations
5
PageRank
0.50
References
10
Authors
2
Name | Order | Citations | PageRank
André Lourenço | 1 | 312 | 45.33
Ana Fred | 2 | 216 | 17.07