High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies.
Authors
Goudey, Benjamin
Abedini, Mani
Hopper, John L
Makalic, Enes
Schmidt, Daniel F
Wagner, John
Zhou, Zeyu
Zobel, Justin
Reumann, Matthias
Publication Date
2015
Journal Title
Health Inf Sci Syst
ISSN
2047-2501
Publisher
Springer Science and Business Media LLC
Volume
3
Issue
Suppl 1 HISA Big Data in Biomedicine and Healthcare 201
Pages
S3
Language
eng
Type
Article
Physical Medium
Electronic-eCollection
Citation
Goudey, B., Abedini, M., Hopper, J. L., Inouye, M., Makalic, E., Schmidt, D. F., Wagner, J., et al. (2015). High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies. Health Inf Sci Syst, 3 (Suppl 1 HISA Big Data in Biomedicine and Healthcare 201), S3. https://doi.org/10.1186/2047-2501-3-S1-S3
Abstract
Genome-wide association studies (GWAS) are a common approach for the systematic discovery of single nucleotide polymorphisms (SNPs) that are associated with a given disease. The univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate that a three-way interaction analysis on GWAS data with 1.1 million SNPs would require over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU-based methods and four times faster than runtimes estimated for GPU methods, indicating how improvements in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer, "Sequoia", at the Lawrence Livermore National Laboratory, assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
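The scale behind these estimates can be checked with a short calculation. The sketch below is illustrative only and is not the authors' code: it counts the SNP combinations an exhaustive search must visit and projects the 5.8-year figure onto a larger machine under the linear-scaling assumption stated in the abstract. The core counts for "Avoca" (65,536) and "Sequoia" (1,572,864) are assumptions taken from commonly published specifications, not from this record, and the function names are hypothetical.

```python
from math import comb


def n_combinations(n_snps: int, order: int) -> int:
    """Number of distinct SNP sets of the given order in an exhaustive search."""
    return comb(n_snps, order)


def scaled_runtime_years(base_years: float, base_cores: int, target_cores: int) -> float:
    """Project a runtime onto a different core count, assuming linear scaling."""
    return base_years * base_cores / target_cores


N_SNPS = 1_100_000          # from the abstract
AVOCA_YEARS = 5.8           # from the abstract (stated as a lower bound, "over 5.8 years")
AVOCA_CORES = 65_536        # assumption: not stated in this record
SEQUOIA_CORES = 1_572_864   # assumption: not stated in this record

pairs = n_combinations(N_SNPS, 2)
triples = n_combinations(N_SNPS, 3)
sequoia_years = scaled_runtime_years(AVOCA_YEARS, AVOCA_CORES, SEQUOIA_CORES)
sequoia_months = sequoia_years * 12

print(f"two-way combinations:   {pairs:.3e}")
print(f"three-way combinations: {triples:.3e}")
print(f"projected Sequoia runtime: {sequoia_months:.1f} months")
```

Under these assumed core counts the projection lands just under three months, in line with the abstract's estimate. The jump from pairs to triples multiplies the search space by roughly a third of the number of SNPs, which is why the order of interaction dominates the cost.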
Keywords
Human Genome, Genetics
Sponsorship
This research was partially funded by NHMRC grant 1033452 and was supported by a Victorian Life Sciences Computation Initiative (VLSCI) grant number 0126 on its Peak Computing Facility at the University of Melbourne, an initiative of the Victorian Government, Australia.
Identifiers
External DOI: https://doi.org/10.1186/2047-2501-3-S1-S3
This record's URL: https://www.repository.cam.ac.uk/handle/1810/279879
Rights
Attribution 4.0 International (CC BY 4.0)
Licence URL: https://creativecommons.org/licenses/by/4.0/