Title | ||
---|---|---|
Supercomputing enabling exhaustive statistical analysis of genome wide association study data: Preliminary results. |
Abstract | ||
---|---|---|
Most published genome-wide association studies (GWAS) do not examine single nucleotide polymorphism (SNP) interactions because computing p-values for the interaction terms is computationally expensive. Our aim is to use supercomputing resources to apply complex statistical techniques to the world's accumulating GWAS, epidemiology, survival and pathology data in order to uncover more information about genetic and environmental risk, biology and aetiology. As a proof of principle, we performed the Bayesian Posterior Probability test on a pseudo data set with 500,000 SNPs and 100 samples. We carried out strong scaling simulations on 2 to 4,096 processing cores, doubling the partition size at each step. On two processing cores the run time is 317 h, i.e. almost two weeks, compared to less than 10 minutes on 4,096 processing cores. The speedup factor of 2,020 is very close to the theoretical value of 2,048. This work demonstrates the feasibility of performing exhaustive higher order analysis of GWAS data using independence testing for contingency tables. We are now in a position to employ supercomputers with hundreds of thousands of threads for higher order analysis of GWAS data using complex statistics. |
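
The scaling figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is not taken from the paper: the function and variable names are hypothetical, and the pair count assumes that "exhaustive" analysis means testing every unordered pair of SNPs, which the abstract does not state explicitly. It uses only the numbers given above (317 h on 2 cores, a measured speedup of 2,020, and partitions from 2 to 4,096 cores).

```python
from math import comb

# Figures quoted in the abstract.
BASE_CORES = 2            # smallest partition used as the baseline
MAX_CORES = 4096          # largest partition
T_BASE_HOURS = 317.0      # run time on the baseline partition
MEASURED_SPEEDUP = 2020.0
N_SNPS = 500_000


def ideal_speedup(base_cores: int, cores: int) -> float:
    """Ideal strong-scaling speedup relative to the baseline partition."""
    return cores / base_cores


def parallel_efficiency(measured: float, ideal: float) -> float:
    """Fraction of the ideal speedup that was actually achieved."""
    return measured / ideal


# Partition sizes with factor-2 increments: 2, 4, ..., 4096.
partitions = [2 ** k for k in range(1, 13)]

ideal = ideal_speedup(BASE_CORES, MAX_CORES)
efficiency = parallel_efficiency(MEASURED_SPEEDUP, ideal)

# Run time on the largest partition implied by the measured speedup.
implied_runtime_min = T_BASE_HOURS / MEASURED_SPEEDUP * 60

# Assumption: exhaustive second-order analysis tests every unordered SNP pair.
n_pairs = comb(N_SNPS, 2)

print(f"ideal speedup:       {ideal:.0f}")
print(f"parallel efficiency: {efficiency:.1%}")
print(f"implied run time:    {implied_runtime_min:.1f} min")
print(f"SNP pairs to test:   {n_pairs:,}")
```

Under these assumptions the measured speedup corresponds to a parallel efficiency of roughly 98.6%, and the implied run time on 4,096 cores is consistent with the "less than 10 minutes" reported in the abstract.
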
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/EMBC.2012.6346166 | EMBC |
Keywords | Field | DocType |
aetiology,epidemiology,independence testing,contingency tables,statistical analysis,genomics,pathology data,supercomputing resources,single nucleotide polymorphism,bayesian posterior probability test,gwas,genome wide association study data,biology,biology computing,genetic risk,bayes methods,environmental risk,distributed processing,survival data | Data mining,Supercomputer,Computer science,Posterior probability,Genomics,Proof of concept,Contingency table,Artificial intelligence,Machine learning,Speedup,Computational complexity theory,Bayesian probability | Conference |
Volume | ISSN | ISBN |
2012 | 1557-170X | 978-1-4577-1787-1 |
Citations | PageRank | References |
2 | 0.39 | 2 |
Authors | ||
---|---|---|
14 |
Name | Order | Citations | PageRank |
---|---|---|---|
Matthias Reumann | 1 | 27 | 5.04 |
Enes Makalic | 2 | 9 | 0.96 |
Benjamin W Goudey | 3 | 2 | 1.41 |
Michael Inouye | 4 | 36 | 5.91 |
Adrian Bickerstaffe | 5 | 20 | 1.83 |
Minh Bui | 6 | 2 | 0.73 |
Daniel J. Park | 7 | 16 | 2.61 |
Miroslaw K Kapuscinski | 8 | 2 | 0.39 |
Daniel F. Schmidt | 9 | 51 | 10.68 |
Zeyu Zhou | 10 | 3 | 1.08 |
Guoqi Qian | 11 | 20 | 6.98 |
Justin Zobel | 12 | 6882 | 880.46 |
John Wagner | 13 | 2 | 0.39 |
John L. Hopper | 14 | 2 | 1.75 |