Genetic feature engineering enables characterisation of shared risk factors in immune-mediated diseases.

Accepted version
Peer-reviewed

Type

Article

Authors

Burren, Oliver S 
Reales, Guillermo 
Wong, Limy 
Bowes, John 
Lee, James C 

Abstract

BACKGROUND: Genome-wide association studies (GWAS) have identified pervasive sharing of genetic architectures across multiple immune-mediated diseases (IMD). By learning the genetic basis of IMD risk from common diseases, this sharing can be exploited to enable analysis of less frequent IMD where, due to limited sample size, traditional GWAS techniques are challenging.

METHODS: Exploiting ideas from Bayesian genetic fine-mapping, we developed a disease-focused shrinkage approach to allow us to distill genetic risk components from GWAS summary statistics for a set of related diseases. We applied this technique to 13 larger GWAS of common IMD, deriving a reduced dimension "basis" that summarised the multidimensional components of genetic risk. We used independent datasets including the UK Biobank to assess the performance of the basis and characterise individual axes. Finally, we projected summary GWAS data for smaller IMD studies, with less than 1000 cases, to assess whether the approach was able to provide additional insights into genetic architecture of less common IMD or IMD subtypes, where cohort collection is challenging.

RESULTS: We identified 13 IMD genetic risk components. The projection of independent UK Biobank data demonstrated the IMD specificity and accuracy of the basis even for traits with very limited case-size (e.g. vitiligo, 150 cases). Projection of additional IMD-relevant studies allowed us to add biological interpretation to specific components, e.g. related to raised eosinophil counts in blood and serum concentration of the chemokine CXCL10 (IP-10). On application to 22 rare IMD and IMD subtypes, we were able to not only highlight subtype-discriminating axes (e.g. for juvenile idiopathic arthritis) but also suggest eight novel genetic associations.

CONCLUSIONS: Requiring only summary-level data, our unsupervised approach allows the genetic architectures across any range of clinically related traits to be characterised in fewer dimensions. This facilitates the analysis of studies with modest sample size by matching shared axes of both genetic and biological risk across a wider disease domain, and provides an evidence base for possible therapeutic repurposing opportunities.
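
The basis-and-projection idea summarised under METHODS can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the basis is obtained by a principal component decomposition (here via SVD) of a traits-by-SNPs matrix of shrunken effect estimates, and that a new study is projected by applying the same shrinkage, centring, and rotation. The function names, the per-SNP shrinkage weights, and the simulated data are all hypothetical.

```python
import numpy as np

def build_basis(beta, weights):
    """Derive a reduced-dimension basis from GWAS effect estimates.

    beta:    (n_traits, n_snps) effect estimates (hypothetical input).
    weights: (n_snps,) shrinkage weights, assumed to be precomputed.
    """
    shrunk = beta * weights                 # down-weight each SNP's effects
    center = shrunk.mean(axis=0)            # per-SNP mean across traits
    _, _, vt = np.linalg.svd(shrunk - center, full_matrices=False)
    return center, vt.T                     # rotation: (n_snps, n_components)

def project(beta_new, weights, center, rotation):
    """Project effect estimates from another study onto the basis."""
    return (beta_new * weights - center) @ rotation

# Simulated stand-in data: 13 traits, 200 SNPs (values are random, not real).
rng = np.random.default_rng(0)
beta = rng.normal(size=(13, 200))
weights = rng.uniform(size=200)

center, rotation = build_basis(beta, weights)
scores = project(beta, weights, center, rotation)   # basis traits themselves
single = project(beta[0], weights, center, rotation)  # one study at a time
```

Because projection needs only a vector of per-SNP effect estimates, a study with few cases can be placed on the same axes as the larger studies that defined them, which is the property the abstract relies on.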

Keywords

Bayes Theorem; Genetic Engineering; Genome-Wide Association Study; Humans; Immune System Diseases; Phenotype; Polymorphism, Single Nucleotide; Risk Factors; Sample Size

Journal Title

Genome Med

Journal ISSN

1756-994X

Volume Title

12

Publisher

Springer Science and Business Media LLC

Rights

All rights reserved

Sponsorship

Wellcome Trust (107881/Z/15/Z)
Medical Research Council (MC_UU_00002/4)
Wellcome Trust (083650/Z/07/Z)
Wellcome Trust (110303/Z/15/Z)
Wellcome Trust (105920/Z/14/Z)
Medical Research Council (MR/R013926/1)