Title
Optimal Set Cover Formulation for Exclusive Row Biclustering of Gene Expression
Abstract
The availability of large microarray data sets has led to growing interest in biclustering methods over the past decade. Several algorithms have been proposed to identify subsets of genes and conditions according to different similarity measures and under varying constraints. In this paper, we focus on the exclusive row biclustering problem (also known as projected clustering) for gene expression, in which each row can be a member of only a single bicluster, while columns can participate in multiple biclusters. This type of biclustering may be adequate, for example, for clustering groups of cancer patients, where each patient (row) is expected to carry only a single type of cancer, while each cancer type is associated with multiple (and possibly overlapping) genes (columns). We present a novel method to identify these exclusive row biclusters in the spirit of the optimal set cover problem. Our algorithmic solution combines existing biclustering algorithms with combinatorial auction techniques. Furthermore, we devise an approach for tuning the threshold of our algorithm based on comparison with a null model, inspired by the Gap statistic approach. We demonstrate our approach on both synthetic and real-world gene expression data and show its power in identifying large-span submatrices with non-overlapping rows, while considering their unique nature.
Year: 2014
DOI: 10.1007/s11390-014-1440-y
Venue: J. Comput. Sci. Technol.
Keywords: biclustering, exclusive row biclustering, projected clustering, gene expression
Field: Row, Set cover problem, Data mining, Statistic, Combinatorial auction, Computer science, Null model, Biclustering, Cluster analysis, Block matrix
DocType: Journal
Volume: 29
Issue: 3
ISSN: 1860-4749
Citations: 1
PageRank: 0.35
References: 17
Authors: 4
Name | Order | Citations | PageRank
amichai | 1 | 1 | 0.35
painsky | 2 | 1 | 0.35
Saharon Rosset | 3 | 1087 | 105.33
Saharon Rosset | 4 | 1087 | 105.33