Abstract | ||
---|---|---|
Biological data sets have been growing rapidly with technological advances in bioinformatics. Data mining techniques are used extensively to detect interesting patterns in large databases. In bioinformatics, clustering can be applied to find underlying genetic and biological interactions without relying on prior information about the data sets. However, many clustering algorithms are available in practice, and different algorithms may generate dissimilar clustering results because of bio-data characteristics and experimental assumptions. In this paper, we propose a novel heterogeneous clustering ensemble scheme that uses a genetic algorithm to generate high-quality, robust clustering results that reflect the characteristics of bio-data. The proposed method combines the results of various clustering algorithms with the crossover operation of a genetic algorithm, and is founded on the concept of using evolutionary processes to select the most commonly inherited characteristics. Our framework proved to be applicable to a real data set, and the optimal clustering results generated by the proposed method are detailed in this paper. Experimental results demonstrate that the proposed method yields better clustering results than applying the single best clustering algorithm. |
Year | DOI | Venue |
---|---|---|
2006 | 10.1007/11691730_9 | BioDM |
Keywords | Field | DocType |
novel heterogeneous clustering ensemble, different clustering algorithm, genetic algorithm, various clustering algorithm, optimal clustering result, single best clustering algorithm, clustering result, heterogeneous clustering ensemble method, robust clustering result, dissimilar clustering result, clustering algorithm, different cluster result, biological data, data mining, genetics | Hierarchical clustering, Data mining, Fuzzy clustering, Canopy clustering algorithm, CURE data clustering algorithm, Data stream clustering, Correlation clustering, Computer science, Determining the number of clusters in a data set, Artificial intelligence, Cluster analysis, Machine learning | Conference |
Volume | ISSN | ISBN |
3916 | 0302-9743 | 3-540-33104-2 |
Citations | PageRank | References |
21 | 0.81 | 10 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hye-Sung Yoon | 1 | 34 | 2.11 |
Sunyoung Ahn | 2 | 21 | 1.49 |
Sang-Ho Lee | 3 | 77 | 7.76 |
Sung-bum Cho | 4 | 63 | 3.02 |
Ju Han Kim | 5 | 248 | 30.80 |
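
The abstract describes combining the outputs of several clustering algorithms through the crossover operation of a genetic algorithm. The sketch below illustrates that general idea only; it is not the authors' implementation. The function names (`rand_index`, `fitness`, `crossover`, `evolve`), the Rand-index-based fitness, the cluster-copy crossover, the elitist selection step, and the toy labelings are all assumptions introduced here for illustration, since the record does not specify them.

```python
import itertools
import random


def rand_index(a, b):
    """Fraction of item pairs on which two labelings agree (same vs. different cluster)."""
    pairs = list(itertools.combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def fitness(candidate, base_partitions):
    """Hypothetical fitness: mean Rand index against every base clustering."""
    total = sum(rand_index(candidate, p) for p in base_partitions)
    return total / len(base_partitions)


def crossover(parent_a, parent_b, rng):
    """Copy one randomly chosen cluster of parent_a into a copy of parent_b."""
    chosen = rng.choice(sorted(set(parent_a)))
    new_label = max(parent_b) + 1  # fresh label keeps the inherited cluster intact
    return [new_label if la == chosen else lb
            for la, lb in zip(parent_a, parent_b)]


def evolve(base_partitions, generations=20, seed=0):
    """Evolve a consensus labeling from the outputs of several clustering algorithms."""
    rng = random.Random(seed)
    population = [list(p) for p in base_partitions]
    size = len(population)
    for _ in range(generations):
        children = []
        for _ in range(size):
            a, b = rng.sample(population, 2)
            children.append(crossover(a, b, rng))
        # Elitist selection (an assumption): keep the fittest of parents and children.
        population = sorted(population + children,
                            key=lambda c: fitness(c, base_partitions),
                            reverse=True)[:size]
    return population[0]


# Toy labelings of six items, standing in for three different clustering algorithms.
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
best = evolve(base)
```

Because parents compete with their children at every generation, the returned labeling never scores lower under this fitness than the best of the input clusterings. A real application would replace the toy labelings with the outputs of actual clustering algorithms and would likely need a fitness measure suited to the data.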