Title | ||
---|---|---|
A divide-and-conquer strategy to solve the out-of-memory problem of processing thousands of Affymetrix microarrays. |
Abstract | ||
---|---|---|
Out-of-memory problems were frequently encountered when processing thousands of CEL files with Bioconductor. We propose a divide-and-conquer strategy combined with randomised resampling to solve this problem. The CAMDA 2007 META-analysis data set, which contains 5896 CEL files, was used to test the approach on a typical commodity computer cluster by running established pre-processing algorithms for Affymetrix arrays from Bioconductor. The results were validated against a gold standard obtained on a supercomputer. In addition to improving performance, the general divide-and-conquer strategy can be applied to any other normalisation algorithm without modifying the underlying implementation. |
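
The abstract does not spell out the procedure, so the following is only a minimal sketch of what a divide-and-conquer scheme with randomised resampling might look like, not the authors' implementation. It assumes that arrays are split into random chunks small enough to fit in memory, that each chunk is normalised independently, and that the per-array results are averaged over several random partitions. All names (`random_partition`, `normalise_chunk`, `divide_and_conquer`) are hypothetical, and the scaling step is a stand-in for a real pre-processing algorithm such as those in Bioconductor.

```python
import random
from statistics import mean


def random_partition(names, chunk_size, rng):
    """Shuffle the array names and cut them into chunks of at most chunk_size."""
    shuffled = list(names)
    rng.shuffle(shuffled)
    return [shuffled[i:i + chunk_size] for i in range(0, len(shuffled), chunk_size)]


def normalise_chunk(chunk):
    """Placeholder normalisation (an assumption, not a real microarray method):
    scale every array so that its mean equals the mean of the whole chunk.
    Assumes strictly positive intensities, so no array has a zero mean."""
    grand_mean = mean(v for values in chunk.values() for v in values)
    return {
        name: [v * grand_mean / mean(values) for v in values]
        for name, values in chunk.items()
    }


def divide_and_conquer(arrays, chunk_size, rounds, seed=0):
    """Normalise the arrays chunk by chunk, repeating with fresh random
    partitions and averaging each array's normalised values over the rounds."""
    rng = random.Random(seed)
    names = sorted(arrays)
    totals = {name: [0.0] * len(arrays[name]) for name in names}
    for _ in range(rounds):
        for chunk_names in random_partition(names, chunk_size, rng):
            # Only one chunk is held in the normalisation step at a time.
            normalised = normalise_chunk({n: arrays[n] for n in chunk_names})
            for name, values in normalised.items():
                totals[name] = [t + v for t, v in zip(totals[name], values)]
    return {name: [t / rounds for t in totals[name]] for name in names}


if __name__ == "__main__":
    # Toy data standing in for probe intensities read from CEL files.
    toy = {f"array_{i}": [float(i + j + 1) for j in range(4)] for i in range(10)}
    result = divide_and_conquer(toy, chunk_size=3, rounds=5)
    print(len(result), "arrays normalised")
```

Because the normalisation routine is only ever called on one chunk, it can in principle be swapped for any existing algorithm without changing that algorithm's code, which is the property the abstract highlights.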
Year | DOI | Venue |
---|---|---|
2008 | 10.1504/IJCBDD.2008.022209 | Int. J. Comput. Biol. Drug Des. |
Keywords | Field | DocType |
divide-and-conquer,r/bioconductor,out-of-memory,affymetrix arrays,divide and conquer,computer clusters,microarray data,resampling,bioinformatics,meta analysis,microarrays | Data mining,Out of memory,Supercomputer,Affymetrix GeneChip Operating Software,Computer science,Bioconductor,Bioinformatics,Divide and conquer algorithms,Resampling,DNA microarray,Computer cluster | Journal |
Volume | Issue | ISSN |
1 | 4 | 1756-0756 |
Citations | PageRank | References |
0 | 0.34 | 3 |
Authors | ||
---|---|---|
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Chia-Ju Lee | 1 | 0 | 0.34 |
Dong Fu | 2 | 0 | 0.34 |
Pan Du | 3 | 200 | 18.68 |
Hongmei Jiang | 4 | 9 | 2.74 |
Simon M. Lin | 5 | 366 | 37.72 |
Warren A. Kibbe | 6 | 464 | 39.92 |