Title
Hierarchical parallelization of gene differential association analysis.
Abstract
Microarray gene differential expression analysis is a widely used technique that deals with high dimensional data and is computationally intensive for permutation-based procedures. Microarray gene differential association analysis is even more computationally demanding and must take advantage of multicore computing technology, which is the driving force behind increasing compute power in recent years. In this paper, we present a two-layer hierarchical parallel implementation of gene differential association analysis. It takes advantage of both fine- and coarse-grain (with granularity defined by the frequency of communication) parallelism in order to effectively leverage the non-uniform nature of parallel processing available in the cutting-edge systems of today. Our results show that this hierarchical strategy matches data sharing behavior to the properties of the underlying hardware, thereby reducing the memory and bandwidth needs of the application. The resulting improved efficiency reduces computation time and allows the gene differential association analysis code to scale its execution with the number of processors. The code and biological data used in this study are downloadable from http://www.urmc.rochester.edu/biostat/people/faculty/hu.cfm. The performance sweet spot occurs when using a number of threads per MPI process that allows the working sets of the corresponding MPI processes running on the multicore to fit within the machine cache. Hence, we suggest that practitioners follow this principle in selecting the appropriate number of MPI processes and threads within each MPI process for their cluster configurations. We believe that the principles of this hierarchical approach to parallelization can be utilized in the parallelization of other computationally demanding kernels.
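The selection principle stated in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes a simplified model that the abstract does not spell out: every MPI process on a node has a working set of the same known size, threads inside a process share that working set, and all processes on the node compete for one shared cache. The function name `choose_layout` and its parameters are hypothetical.

```python
def choose_layout(cores_per_node, cache_bytes, working_set_bytes):
    """Pick (mpi_processes_per_node, threads_per_process) for one node.

    Assumed model (not taken from the paper): each MPI process needs
    `working_set_bytes` of cache, threads within a process share it,
    and all processes on the node share a cache of `cache_bytes`.
    """
    # Try the smallest thread counts first, so that we keep as many
    # MPI processes as the cache can hold.
    for threads in range(1, cores_per_node + 1):
        if cores_per_node % threads != 0:
            continue  # only consider layouts that use every core evenly
        processes = cores_per_node // threads
        if processes * working_set_bytes <= cache_bytes:
            return processes, threads
    # Even a single process does not fit in cache: fall back to one
    # process that uses every core as a thread.
    return 1, cores_per_node


MIB = 2 ** 20
print(choose_layout(8, 8 * MIB, 3 * MIB))  # two processes of four threads fit
```

Under this model, an 8-core node with an 8 MiB shared cache and a 3 MiB working set per process would run two MPI processes with four threads each. Real working-set sizes would have to be measured for the data set and cluster at hand.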
Year: 2011
DOI: 10.1186/1471-2105-12-374
Venue: BMC Bioinformatics
Keywords: bioinformatics, programming languages, microarrays, cluster analysis, algorithms
Field: Clustering high-dimensional data, Shared memory, Computer science, Parallel computing, Parallel processing, Permutation, Differential association, Theoretical computer science, Software, Bioinformatics, Granularity, DNA microarray
DocType: Journal
Volume: 12
Issue: 1
ISSN: 1471-2105
Citations: 8
PageRank: 0.34
References: 9
Authors: 4
Name, Order, Citations, PageRank
Mark Needham, 1, 8, 0.34
Rui Hu, 2, 78, 3.47
Sandhya Dwarkadas, 3, 3504, 257.31
Xing Qiu, 4, 193, 12.55