Title
A divide-and-conquer strategy to solve the out-of-memory problem of processing thousands of Affymetrix microarrays.
Abstract
Out-of-memory problems were frequently encountered when processing thousands of CEL files using Bioconductor. We propose a divide-and-conquer strategy combined with randomised resampling to solve this problem. The CAMDA 2007 META-analysis data set, which contains 5896 CEL files, was used to test the approach on a typical commodity computer cluster by running established pre-processing algorithms for Affymetrix arrays in the Bioconductor package. The results were validated against a gold standard obtained using a supercomputer. In addition to improving performance, the general divide-and-conquer strategy can be applied to any other normalisation algorithm without modifying the underlying implementation.
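The abstract does not spell out the procedure, so the following is only a minimal sketch of one possible reading of "divide-and-conquer combined with randomised resampling", not the authors' implementation. It assumes that arrays are randomly partitioned into chunks small enough to fit in memory, that each chunk is normalised independently by an unmodified routine, and that results are averaged over several random partitions. The function names, the `chunk_size` and `n_rounds` parameters, and the use of a plain quantile normalisation as the per-chunk algorithm are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def quantile_normalise(arrays):
    """Quantile-normalise equal-length intensity lists (ties are not averaged)."""
    n_probes = len(arrays[0])
    sorted_cols = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [mean(col[k] for col in sorted_cols) for k in range(n_probes)]
    normalised = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalised.append(out)
    return normalised


def divide_and_conquer(arrays, chunk_size, n_rounds=5, seed=0):
    """Hypothetical sketch: normalise random chunks, average over rounds."""
    rng = random.Random(seed)
    n = len(arrays)
    n_probes = len(arrays[0])
    sums = [[0.0] * n_probes for _ in range(n)]
    for _ in range(n_rounds):
        # Resampling step (assumed): a fresh random partition each round.
        idx = list(range(n))
        rng.shuffle(idx)
        for start in range(0, n, chunk_size):
            chunk = idx[start:start + chunk_size]
            # Only chunk_size arrays are handed to the normaliser at once.
            result = quantile_normalise([arrays[i] for i in chunk])
            for i, row in zip(chunk, result):
                for p, value in enumerate(row):
                    sums[i][p] += value
    return [[value / n_rounds for value in row] for row in sums]
```

In this sketch the per-chunk routine is called as a black box, which mirrors the abstract's point that the strategy does not require modifying the underlying normalisation code. In a cluster setting each chunk could be dispatched to a separate node; that part is omitted here.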
Year: 2008
DOI: 10.1504/IJCBDD.2008.022209
Venue: Int. J. Comput. Biol. Drug Des.
Keywords: divide-and-conquer, r/bioconductor, out-of-memory, affymetrix arrays, divide and conquer, computer clusters, microarray data, resampling, bioinformatics, meta analysis, microarrays
Field: Data mining, Out of memory, Supercomputer, Affymetrix GeneChip Operating Software, Computer science, Bioconductor, Bioinformatics, Divide and conquer algorithms, Resampling, DNA microarray, Computer cluster
DocType: Journal
Volume: 1
Issue: 4
ISSN: 1756-0756
Citations: 0
PageRank: 0.34
References: 3
Authors: 6

Name              Order  Citations  PageRank
Chia-Ju Lee       1      0          0.34
Dong Fu           2      0          0.34
Pan Du            3      200        18.68
Hongmei Jiang     4      9          2.74
Simon M. Lin      5      366        37.72
Warren A Kibbe    6      464        39.92