simGWAS: a fast method for simulation of large scale case-control GWAS summary statistics
Publication Date
2019-06-01
Journal Title
Bioinformatics
ISSN
1367-4811
Publisher
Oxford University Press
Type
Article
This Version
VoR
Citation
Fortune, M., & Wallace, C. (2019). simGWAS: a fast method for simulation of large scale case-control GWAS summary statistics. Bioinformatics. https://doi.org/10.1093/bioinformatics/bty898
Abstract
Methods for analysis of GWAS summary statistics have encouraged data sharing and democratised the analysis of different diseases. Ideal validation for such methods is application to simulated data, where some "truth" is known. As GWAS increase in size, so does the computational complexity of such evaluations; standard practice repeatedly simulates and analyses genotype data for all individuals in an example study. We have developed a novel method based on an alternative approach, directly simulating GWAS summary data, without individual data as an intermediate step. We mathematically derive the expected statistics for any set of causal variants and their effect sizes, conditional upon control haplotype frequencies (available from public reference datasets). Simulation of GWAS summary output can be conducted independently of sample size by simulating random variates about these expected values. Across a range of scenarios, our method, available as an open source R package, produces very similar output to that from simulating individual genotypes with a substantial gain in speed even for modest sample sizes. Fast simulation of GWAS summary statistics will enable more complete and rapid evaluation of summary statistic methods as well as opening new potential avenues of research in fine mapping and gene set enrichment analysis.
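The final step described in the abstract, simulating random variates about expected values, can be illustrated with a minimal sketch. It is not the simGWAS package's implementation: it assumes the expected Z scores have already been derived, and it assumes the noise around them is multivariate normal with covariance equal to the correlation (LD) matrix between variants. The function names, the three-variant expected Z vector, and the LD matrix are all hypothetical values chosen for illustration.

```python
import math
import random


def cholesky(matrix):
    """Return lower-triangular L such that L * L^T equals `matrix`.

    `matrix` must be symmetric and positive definite.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def simulate_z(expected_z, ld, n_reps, seed=0):
    """Draw `n_reps` vectors of Z scores centred on `expected_z`.

    Assumption (for illustration only): noise is multivariate normal
    with covariance equal to the LD correlation matrix `ld`.
    """
    rng = random.Random(seed)
    lower = cholesky(ld)
    n = len(expected_z)
    draws = []
    for _ in range(n_reps):
        # Independent standard normal noise, one value per variant.
        noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Multiply by the Cholesky factor to induce the LD correlation.
        correlated = [
            sum(lower[i][k] * noise[k] for k in range(i + 1)) for i in range(n)
        ]
        draws.append([expected_z[i] + correlated[i] for i in range(n)])
    return draws


# Hypothetical inputs: expected Z scores at three variants and their LD matrix.
expected_z = [4.0, 3.2, 0.5]
ld = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]

draws = simulate_z(expected_z, ld, n_reps=2000, seed=1)
```

In this sketch the cost of each replicate depends only on the number of variants, not on the number of cases and controls, which is the property the abstract describes as simulation independent of sample size. Sample size would enter only through the expected Z scores supplied as input.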
Sponsorship
MF and CW are funded by the Wellcome Trust (WT099772, WT107881) and CW by the MRC (MC_UU_00002/4). MF is currently funded by Dementia Platforms UK.
Funder references
Wellcome Trust (107881/Z/15/Z)
Medical Research Council (MC_UU_00002/4)
Wellcome Trust (099772/Z/12/Z)
Identifiers
External DOI: https://doi.org/10.1093/bioinformatics/bty898
This record's URL: https://www.repository.cam.ac.uk/handle/1810/286603