
Exploring Multivariate Gene-Environment Interactions: Models And Applications





Moore, Rachel 


Complex diseases are driven by multiple risk factors, including genetic variants, environmental exposures and interactions between the two. The advent of genome-wide association studies (GWAS) in 2005 and subsequent methodological advances have increased our knowledge of the genetic risk factors underpinning complex diseases. In addition, some research exploring genotype-environment interaction (G×E) effects has been conducted, revealing that multiple environments are linked to interaction effects at a single locus for a given trait. However, correlation between these identified environments makes the results difficult to interpret. This, together with the collation of large-scale biobanks that contain a multitude of phenotypic and environmental data (facilitating an increase in the number of G×E effects detected), has generated the need for methods that jointly account for G×E at multiple environments. Such methods may also increase the power to detect interaction effects by aggregating modest or weak G×E effects across environments, and may enable additional phenotypic variance to be explained. Thus, the aim of this thesis is to provide suitable methods to identify variants subject to G×E effects, jointly accounting for multiple environmental exposures, and to explore these effects across a range of phenotypes using the UK Biobank data.

In Chapter 2, I describe the structured linear mixed model (StructLMM), a novel computationally efficient multivariate G×E framework. This model can be used to test for interaction or association effects. The latter accounts for possible heterogeneity in variant effects across individuals due to differences in environmental exposures, thus enabling the detection of variant effects that might otherwise be masked due to the presence of interaction effects. I show through simulation experiments that StructLMM is robustly calibrated and, in general, better powered than existing interaction and association tests.
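
The idea of a variant effect that differs across individuals according to their environmental exposures can be illustrated with a small simulation. The sketch below is not StructLMM and does not use its software. It is a plain fixed-effect least-squares illustration, and every name and value in it (`n`, `k`, `beta_g`, `gamma`, the effect sizes and the allele frequency) is an assumption chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 2000, 5  # illustrative numbers of individuals and environments

# Genotype dosages (0, 1 or 2 copies) at one variant; allele frequency is assumed.
g = rng.binomial(2, 0.3, size=n).astype(float)

# Standardised environmental exposures, one column per environment.
E = rng.standard_normal((n, k))
E = (E - E.mean(axis=0)) / E.std(axis=0)

# Hypothetical effect sizes: a persistent genetic effect plus
# one interaction weight per environment.
beta_g = 0.1
gamma = rng.normal(0.0, 0.05, size=k)

# Per-individual allelic effect: varies with each person's environmental profile.
beta_i = beta_g + E @ gamma

# Phenotype = heterogeneous genetic effect + additive environment effects + noise.
y = g * beta_i + E @ rng.normal(0.0, 0.1, size=k) + rng.standard_normal(n)

# Joint least-squares fit with all k genotype-by-environment product terms,
# rather than testing one environment at a time.
X = np.column_stack([np.ones(n), g, E, g[:, None] * E])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
gamma_hat = coef[2 + k:]  # estimated interaction weights, one per environment
```

Fitting all product terms jointly is only the simplest way to account for several environments at once. A mixed-model approach such as the one described above would instead treat the interaction effects as random and test their variance, which this sketch does not attempt.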

In Chapter 3, I present an application of StructLMM, where I identify significant interaction effects with 64 lifestyle-based factors for BMI using the UK Biobank data. In addition, I show that the StructLMM association test can be used to identify loci with genotype-environment contributions. Subsequently, I explore characteristics of loci with significant interaction effects, including the fraction of the genetic variance that is explained by G×E and the environmental profiles that increase or decrease phenotypic risk, using methods that are implemented as part of StructLMM.

In Chapter 4, I apply the StructLMM interaction test to multiple cardiometabolic traits using the UK Biobank data, facilitating exploration of shared G×E architecture. Additionally, I provide preliminary estimates of the amount of phenotypic variation that can be explained by G×E effects, relative to marginal association effects.

Taken together, the work in this thesis demonstrates the need for, and advantages of, jointly modelling interaction effects at multiple environments, and provides a new computationally efficient method to achieve this. Combined with the recent and ongoing generation of large biobanks, further research in this field has the potential to advance our understanding of complex traits and diseases.





Barroso, Inês
Stegle, Oliver


Genetics, Gene-environment interactions, Phenome-wide association study, Genome-wide association study, Cardiometabolic traits, StructLMM


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge