Title
Clustering Of Gene Expression Profiles: Creating Initialization-Independent Clusterings By Eliminating Unstable Genes
Abstract
Clustering is an important approach in the analysis of biological data, and is often the first step in identifying interesting patterns of coexpression in gene expression data. Because gene expression data are highly complex and diverse, many genes cannot be easily assigned to a cluster; yet even if these genes are highly dissimilar to all gene groups, they are eventually forced to become members of a cluster. In this paper we show how to detect such elements, called unstable elements. We have developed an approach for iterative clustering algorithms in which unstable elements are deleted, making the iterative algorithm less dependent on the initial centers. Although the approach is unsupervised, the clusters into which the reduced data set is subdivided are less likely to contain false positives. This yields a more differentiated approach to clustering biological data, since the cluster analysis is divided into two parts: the pruned data set is divided into highly consistent clusters in an unsupervised way, and the removed, unstable elements, for which no meaningful cluster exists in unsupervised terms, can be assigned to a cluster using biological knowledge and information about the likelihood of cluster membership. We illustrate our framework on both an artificial and a real biological data set.
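The abstract does not spell out how unstable elements are detected, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes that k-means stands in for the iterative algorithm, that instability is measured by how inconsistently an element is co-clustered with the others across runs started from different initial centers, and that a cut-off of 0.5 separates stable from unstable elements. The names `kmeans_labels`, `instability_scores`, `prune_unstable` and the `threshold` parameter are hypothetical.

```python
import numpy as np

def kmeans_labels(X, centers, n_iter=100):
    """Plain Lloyd iterations from the given initial centers; returns labels."""
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = X[labels == k]
            if len(members):  # keep the old center if a cluster empties
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def instability_scores(X, inits):
    """Score in [0, 1] per element; higher means less consistent co-clustering."""
    runs = np.array([kmeans_labels(X, c) for c in inits])  # (n_runs, n_points)
    # co[i, j] = fraction of runs in which i and j share a cluster.
    co = (runs[:, :, None] == runs[:, None, :]).mean(axis=0)
    # A pair that always (1) or never (0) co-clusters contributes 0;
    # a pair that co-clusters in half of the runs contributes 1.
    return (2 * np.minimum(co, 1 - co)).mean(axis=1)

def prune_unstable(X, inits, threshold=0.5):
    """Assumed rule: keep elements whose score does not exceed the threshold."""
    scores = instability_scores(X, inits)
    return scores <= threshold, scores

# Toy data: two tight groups and one point halfway between them.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 0], [10, 1], [11, 0], [11, 1],
              [5.5, 0.5]], dtype=float)
# Two different sets of initial centers (assumed; any collection would do).
inits = [np.array([[0.0, 0.0], [10.0, 0.0]]),
         np.array([[1.0, 1.0], [11.0, 1.0]])]

keep, scores = prune_unstable(X, inits)
X_pruned = X[keep]
print("unstable indices:", np.flatnonzero(~keep))
```

In this toy case the halfway point joins a different group depending on the initial centers, so it receives a high score and is removed, after which both initializations lead to the same partition of the remaining points. The removed point could then be assigned separately, for example using external biological knowledge as the abstract suggests.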
Year: 2010
DOI: 10.2390/biecoll-jib-2010-134
Venue: JOURNAL OF INTEGRATIVE BIOINFORMATICS
Field: Biological data, Cluster (physics), Data mining, Complete-linkage clustering, Gene, Iterative method, Computer science, Bioinformatics, Initialization, Cluster analysis, False positive paradox
DocType: Journal
Volume: 7
Issue: 3
ISSN: 1613-4516
Citations: 1
PageRank: 0.36
References: 7
Authors: 3
Name | Order | Citations | PageRank
de mulder | 1 | 10 | 1.17
Martin Kuiper | 2 | 521 | 32.47
René Boel | 3 | 56 | 5.58