Using published data in Mendelian randomization: a blueprint for efficient identification of causal risk factors.

Scott, Robert A 
Timpson, Nicholas J 
Davey Smith, George 
Thompson, Simon G 

Finding individual-level data for adequately powered Mendelian randomization analyses may be problematic. As publicly available summarized data on genetic associations with disease outcomes from large consortia are becoming more abundant, use of published data is an attractive analysis strategy for obtaining precise estimates of the causal effects of risk factors on outcomes. We detail the necessary steps for conducting Mendelian randomization investigations using published data, and present novel statistical methods for combining data on the associations of multiple (correlated or uncorrelated) genetic variants with the risk factor and outcome into a single causal effect estimate. A two-sample analysis strategy may be employed, in which evidence on the gene-risk factor and gene-outcome associations is taken from different data sources. These approaches allow the efficient identification of risk factors that are suitable targets for clinical intervention from published data, although the ability to assess the assumptions necessary for causal inference is diminished. Methods and guidance are illustrated using the example of the causal effect of serum calcium levels on fasting glucose concentrations. The estimated causal effect of a 1 standard deviation (0.13 mmol/L) increase in calcium levels on fasting glucose (mM) using a single lead variant from the CASR gene region is 0.044 (95% credible interval -0.002, 0.100). In contrast, using our method to account for the correlation between variants, the corresponding estimate using 17 genetic variants is 0.022 (95% credible interval 0.009, 0.035), a more clearly positive causal effect.
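
To make the idea of combining summarized associations concrete, the following is a minimal Python sketch. It is not the method of the paper, which reports credible intervals from its own procedure. It uses inverse-variance weighting instead, a simpler and commonly used frequentist technique, and assumes that the uncertainty in the gene-risk factor associations can be ignored. All function names and all numbers are hypothetical and chosen only for illustration; they are not the calcium or glucose data.

```python
from math import sqrt


def ratio_estimate(beta_x, beta_y, se_y):
    """Single-variant ratio estimate with a first-order standard error."""
    return beta_y / beta_x, se_y / abs(beta_x)


def ivw_estimate(beta_x, beta_y, se_y):
    """Inverse-variance weighted estimate for uncorrelated variants.

    Equivalent to regressing beta_y on beta_x through the origin
    with weights 1 / se_y**2.
    """
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    return num / den, sqrt(1.0 / den)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def correlated_ivw_estimate(beta_x, beta_y, se_y, rho):
    """Generalized weighted estimate allowing for correlated variants.

    rho is the (assumed known) correlation matrix between variants;
    omega[i][j] = se_y[i] * se_y[j] * rho[i][j].
    """
    n = len(beta_x)
    omega = [[se_y[i] * se_y[j] * rho[i][j] for j in range(n)]
             for i in range(n)]
    w = solve(omega, beta_x)  # omega^{-1} beta_x
    den = sum(wi * bx for wi, bx in zip(w, beta_x))
    num = sum(wi * by for wi, by in zip(w, beta_y))
    return num / den, sqrt(1.0 / den)


# Hypothetical summarized data for three variants (illustration only).
bx = [0.10, 0.20, 0.15]      # gene-risk factor associations
by = [0.021, 0.039, 0.033]   # gene-outcome associations
se = [0.010, 0.012, 0.011]   # standard errors of by
rho = [[1.0, 0.5, 0.0],
       [0.5, 1.0, 0.0],
       [0.0, 0.0, 1.0]]

print(ratio_estimate(bx[0], by[0], se[0]))
print(ivw_estimate(bx, by, se))
print(correlated_ivw_estimate(bx, by, se, rho))
```

In a two-sample setting, `bx` and `by` would be taken from different published sources. When `rho` is the identity matrix, the correlated estimate reduces to the uncorrelated one.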

Data Interpretation, Statistical, Genetic Predisposition to Disease, Genetic Variation, Humans, Mendelian Randomization Analysis, Random Allocation, Risk Factors, Sensitivity and Specificity
Journal Title
Eur J Epidemiol
Springer Science and Business Media LLC
British Heart Foundation (None)
Medical Research Council (G0800270)
Medical Research Council (MC_UU_12015/1)
Medical Research Council (MR/L003120/1)
British Heart Foundation (CH/12/2/29428)
Wellcome Trust (100114/Z/12/Z)
Medical Research Council (MC_U106179471)
We thank all EPIC participants and staff for their contribution to the study. We thank staff from the Technical, Field Epidemiology and Data Functional Group Teams of the MRC Epidemiology Unit in Cambridge, UK, for carrying out sample preparation, DNA provision and quality control, genotyping and data-handling work. Funding for the biomarker measurements in the random subcohort was provided by grants to EPIC-InterAct from the European Community Framework Programme 6 (Integrated Project LSHM-CT-2006-037197) and to EPIC-Heart from the Medical Research Council and British Heart Foundation (Joint Award G0800270). Stephen Burgess is supported by the Wellcome Trust (Grant Number 100114). Simon G. Thompson is supported by the British Heart Foundation (Grant Number CH/12/2/29428). No specific funding was received for the writing of this manuscript.