A comparison of Cox and logistic regression for use in genome-wide association studies of cohort and case-cohort design
European Journal of Human Genetics
Nature Publishing Group
Staley, J., Jones, E., Kaptoge, S., Butterworth, A., Sweeting, M., Wood, A., & Howson, J. (2017). A comparison of Cox and logistic regression for use in genome-wide association studies of cohort and case-cohort design. European Journal of Human Genetics, 25 (7), 854-862. https://doi.org/10.1038/ejhg.2017.78
Logistic regression is often used instead of Cox regression to analyse genome-wide association studies (GWAS) of single-nucleotide polymorphisms (SNPs) and disease outcomes with cohort and case-cohort designs, as it is less computationally expensive. Although Cox and logistic regression models have been compared previously in cohort studies, this work does not completely cover the GWAS setting nor extend to the case-cohort study design. Here, we evaluated Cox and logistic regression applied to cohort and case-cohort genetic association studies using simulated data and genetic data from the EPIC-CVD study. In the cohort setting, there was a modest improvement in power to detect SNP-disease associations using Cox regression compared with logistic regression, which increased as the disease incidence increased. In contrast, logistic regression had more power than (Prentice weighted) Cox regression in the case-cohort setting. Logistic regression yielded inflated effect estimates (assuming the hazard ratio is the underlying measure of association) for both study designs, especially for SNPs with greater effect on disease. Given that logistic regression is substantially more computationally efficient than Cox regression in both settings, we propose a two-step approach to GWAS in cohort and case-cohort studies: first, analyse all SNPs with logistic regression to identify associated variants below a pre-defined P-value threshold; second, fit Cox regression (appropriately weighted in case-cohort studies) to those identified SNPs to ensure accurate estimation of association with disease.
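The two-step approach described in the abstract can be illustrated with a minimal sketch for the cohort setting only. This is not the authors' implementation: the function names, the simulated cohort, the P-value threshold of 1e-4, and the use of single-SNP Newton-Raphson fits without covariates are all assumptions made for illustration. The Cox fit assumes no tied event times and omits the Prentice weighting that a case-cohort analysis would need.

```python
import math
import random

def wald_p(beta, se):
    """Two-sided Wald P-value from a normal approximation."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))

def logistic_fit(g, y, iters=15):
    """Newton-Raphson fit of logit P(y=1) = a + b*g; returns (b, se_b)."""
    a = b = 0.0
    for _ in range(iters):
        ga = gb = haa = hab = hbb = 0.0  # score and information entries
        for gi, yi in zip(g, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * gi)))
            w = p * (1.0 - p)
            ga += yi - p
            gb += (yi - p) * gi
            haa += w
            hab += w * gi
            hbb += w * gi * gi
        det = haa * hbb - hab * hab
        a += (hbb * ga - hab * gb) / det
        b += (haa * gb - hab * ga) / det
    return b, math.sqrt(haa / det)  # variance of b from the inverse information

def cox_fit(g, time, event, iters=15):
    """Newton-Raphson fit of the Cox partial likelihood for one covariate.
    Assumes no tied event times. Returns (log hazard ratio, standard error)."""
    order = sorted(range(len(g)), key=lambda i: -time[i])  # latest times first
    b = 0.0
    for _ in range(iters):
        s0 = s1 = s2 = score = info = 0.0
        for i in order:  # the risk set grows as we move back in time
            r = math.exp(b * g[i])
            s0 += r
            s1 += r * g[i]
            s2 += r * g[i] ** 2
            if event[i]:
                score += g[i] - s1 / s0
                info += s2 / s0 - (s1 / s0) ** 2
        b += score / info
    return b, math.sqrt(1.0 / info)

def two_step_gwas(genotypes, time, event, p_threshold=1e-4):
    """Step 1: logistic screen of every SNP. Step 2: Cox refit of the hits."""
    results = {}
    for j, g in enumerate(genotypes):
        log_or, se_or = logistic_fit(g, event)
        if wald_p(log_or, se_or) < p_threshold:
            log_hr, se_hr = cox_fit(g, time, event)
            results[j] = (log_or, log_hr, se_hr)
    return results

# Simulated cohort; every value here is an illustrative assumption.
rng = random.Random(1)
n, n_snps, maf, true_log_hr = 2000, 20, 0.3, 0.4
genotypes = [[(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
             for _ in range(n_snps)]
time, event = [], []
for i in range(n):
    # Only SNP 0 affects the hazard; follow-up is censored at 5 time units.
    t = rng.expovariate(0.1 * math.exp(true_log_hr * genotypes[0][i]))
    time.append(min(t, 5.0))
    event.append(1 if t < 5.0 else 0)

results = two_step_gwas(genotypes, time, event)
for j, (log_or, log_hr, se_hr) in sorted(results.items()):
    print(f"SNP {j}: log OR = {log_or:.3f}, log HR = {log_hr:.3f} (SE {se_hr:.3f})")
```

In this sketch the cheap logistic screen runs on every SNP, and the more expensive Cox fit runs only on SNPs that pass the threshold, so the reported effect for each hit is a hazard ratio rather than an odds ratio.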
This work was supported by the UK Medical Research Council (G66840) and Pfizer (G73632). EPIC-CVD was funded by grants awarded to the University of Cambridge from the EU Framework Programme 7 (HEALTH-F2-2012-279233), the UK Medical Research Council (G0800270), British Heart Foundation (SP/09/002) and the European Research Council (268834). The EPIC InterAct project was funded by the EU FP6 programme (LSHM_CT_2006_037197) and is also supported by MC_UU_12015/1 and MC_UU_12015/5.
British Heart Foundation (RG/08/014/24067)
British Heart Foundation (RG/13/13/30194)
European Research Council (268834)
External DOI: https://doi.org/10.1038/ejhg.2017.78
This record's URL: https://www.repository.cam.ac.uk/handle/1810/265036
Attribution 4.0 International
Licence URL: http://creativecommons.org/licenses/by/4.0/