Paper Info

Title
Biclustering Sparse Binary Genomic Data.

Abstract
Genomic datasets often consist of large, binary, sparse data matrices. In such a dataset, one is often interested in finding contiguous blocks that (mostly) contain ones. This is a biclustering problem, and while many algorithms have been proposed to deal with gene expression data, only two algorithms have been proposed that specifically deal with binary matrices. None of the gene expression biclustering algorithms can handle the large number of zeros in sparse binary matrices. The two proposed binary algorithms failed to produce meaningful results. In this article, we present a new algorithm that is able to extract biclusters from sparse, binary datasets. A powerful feature is that biclusters with different numbers of rows and columns can be detected, ranging from many rows and few columns to few rows and many columns. It allows the user to guide the search towards biclusters of specific dimensions. When applying our algorithm to an input matrix derived from TRANSFAC, we find transcription factors with distinctly dissimilar binding motifs, but a clear set of common targets that are significantly enriched for GO categories.

Year	DOI	Venue
2008	10.1089/cmb.2008.0066	JOURNAL OF COMPUTATIONAL BIOLOGY
Keywords	Field	DocType
biclustering, binary data, transcription factor binding	Row, Row and column spaces, Matrix (mathematics), Computer science, Artificial intelligence, Binary data, Biclustering, Bioinformatics, TRANSFAC, Machine learning, Sparse matrix, Binary number	Journal
Volume	Issue	ISSN
15	10	1066-5277
Citations	PageRank	References
12	0.99	11
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Miranda Van Uitert	1	103	6.97
Wouter Meuleman	2	69	2.71
Lodewyk F A Wessels	3	337	22.28
