
Deep forward and reverse phenotyping for genetic discovery in pulmonary arterial hypertension.





Swietlik, Emilia Maria 


Pulmonary arterial hypertension (PAH) is a rare disease characterised by constriction and obliteration of small pulmonary arteries, which leads to increased pulmonary vascular resistance and, in consequence, right ventricular failure and death. The accurate clinical diagnosis of pulmonary hypertension became possible in the 1950s with the invention of right heart catheterisation, but it was not until 2000 that the landmark discovery of the causative role of bone morphogenetic protein receptor type 2 (BMPR2) mutations shed new light on the pathogenesis of PAH. Since then, several further genes have been discovered; together these now account for around 25% of cases with a clinical diagnosis of idiopathic PAH (IPAH). Despite ongoing efforts, the cause of the disease remains elusive for the majority of patients, a phenomenon often referred to as “missing heritability”. The objective of my PhD project was to expand our understanding of the genetic architecture of PAH by studying a large international cohort of deeply phenotyped patients with idiopathic and heritable PAH who had been whole-genome sequenced. The approach I have used in this thesis differs from previously published studies. Rather than relying solely on case labels that follow the diagnostic classification of PAH, I clustered patients into homogeneous groups based on deep phenotype information, in the hope that such groups would be enriched for rare deleterious variants. I took a two-fold approach. Firstly, using domain knowledge, I stratified the patients based on age, response to acute nitric oxide challenge, transfer coefficient for carbon monoxide (KCO), the presence of small cardiac defects and poor- or super-survivor status. Secondly, to account for complex phenotypes, I annotated patients with Human Phenotype Ontology terms and derived computational clusters based on ontological similarity.
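As an illustration of what "ontological similarity" can mean in practice, the following is a minimal sketch of Lin's semantic similarity between two ontology terms. The abstract does not state which measure was used at this step; the later "PAM 2 (Lin)" tag suggests Lin's measure may have been among them, but that is an assumption. The toy ontology, the term names and the annotation frequencies below are all hypothetical and are not real Human Phenotype Ontology identifiers or cohort data.

```python
import math

# Hypothetical toy ontology: term -> list of parent terms.
# These names are illustrative only, NOT real HPO identifiers.
PARENTS = {
    "phenotype": [],
    "cardiovascular": ["phenotype"],
    "respiratory": ["phenotype"],
    "right_ventricular_failure": ["cardiovascular"],
    "septal_defect": ["cardiovascular"],
    "reduced_gas_transfer": ["respiratory"],
}

# Hypothetical fraction of patients annotated with each term
# or with any of its descendants (made-up numbers).
FREQ = {
    "phenotype": 1.0,
    "cardiovascular": 0.5,
    "respiratory": 0.4,
    "right_ventricular_failure": 0.2,
    "septal_defect": 0.1,
    "reduced_gas_transfer": 0.25,
}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """Information content: rarer terms are more informative."""
    return math.log(1.0 / FREQ[term])


def lin_similarity(a, b):
    """Lin similarity: 2 * IC(most informative common ancestor) / (IC(a) + IC(b))."""
    common = ancestors(a) & ancestors(b)
    ic_mica = max(information_content(t) for t in common)
    denom = information_content(a) + information_content(b)
    # denom is zero only when both terms are the root, i.e. identical.
    return 2.0 * ic_mica / denom if denom > 0 else 1.0


print(lin_similarity("right_ventricular_failure", "septal_defect"))
print(lin_similarity("right_ventricular_failure", "reduced_gas_transfer"))
```

Terms that share only the root score zero, while terms under a specific common parent score higher. Turning term-level scores into patient-level similarities, and then into clusters, needs further steps that are not shown here.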
Once patients were grouped into new phenotypic clusters, I deployed a Bayesian model comparison method, BeviMed, for case-control analysis. The BeviMed analysis identified 59 significant gene-tag associations with a posterior probability (PP) above 0.75 (with the prior probability set to 0.001), including three associations with a new risk gene, kinase insert domain receptor (KDR), which encodes vascular endothelial growth factor receptor 2 (VEGFR2), and a strong association with a new candidate gene, COL6A5. While BMPR2, TBX4, EIF2AK4, ACVRL1 and AQP1 showed the strongest associations (PP ≥ 0.99), I also confirmed significant associations for the majority of other previously identified genes. High impact variants in KDR were associated with a significantly reduced KCO (tag: KCO lower tertile, log(BF) = 11.362, PP = 0.989), older age at diagnosis (tag: old age, log(BF) = 9.249, PP = 0.912) and the PAM 2 (Lin) cluster (tag: PAM2, log(BF) = 8.048, PP = 0.758) under the autosomal dominant mode of inheritance. Moderate and high impact variants in COL6A5 were strongly associated with divisive hierarchical clustering cluster 2 (tag: Divisive HC 2, log(BF) = 10.627, PP = 0.976) under the autosomal dominant mode of inheritance. Computationally derived phenotypes led to the discovery of three additional new candidate genes (OR6T1, EYS, HPSE2), albeit at lower significance levels. The refined phenotype approach corroborated many previously reported genotype-phenotype associations. Individuals with rare variants in BMPR2, TBX4 (high impact), EIF2AK4 (biallelic) and SOX17 had a significantly younger age of disease onset (tag: young age). Patients with familial PAH harboured deleterious variants in AQP1 (log(BF) = 10.023, PP = 0.958). Additionally, high impact variants in BMPR2 were associated with preserved KCO (tag: KCO higher tertile, log(BF) = 99.923, PP = 1), while biallelic EIF2AK4 mutations were associated with reduced KCO (tag: KCO < 50% predicted, log(BF) = 29.741, PP = 1).
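The relationship between the reported log Bayes factors and posterior probabilities can be checked with a short calculation. This is a minimal sketch, assuming that log(BF) is a natural logarithm and that the PP follows from posterior odds = BF × prior odds with the stated prior of 0.001. The function name is hypothetical, and this is not BeviMed's own code.

```python
import math


def posterior_probability(log_bf, prior):
    """Posterior probability of association from a natural-log Bayes factor.

    Assumes posterior odds = Bayes factor * prior odds.
    """
    prior_odds = prior / (1.0 - prior)
    log_posterior_odds = log_bf + math.log(prior_odds)
    # Logistic transform turns log odds back into a probability.
    return 1.0 / (1.0 + math.exp(-log_posterior_odds))


# (gene, tag) -> (log(BF), PP) as quoted in the abstract.
reported = {
    ("KDR", "KCO lower tertile"): (11.362, 0.989),
    ("KDR", "old age"): (9.249, 0.912),
    ("KDR", "PAM2"): (8.048, 0.758),
    ("COL6A5", "Divisive HC 2"): (10.627, 0.976),
    ("AQP1", "familial PAH"): (10.023, 0.958),
}

for (gene, tag), (log_bf, pp_reported) in reported.items():
    pp = posterior_probability(log_bf, prior=0.001)
    print(f"{gene:7s} {tag:18s} computed PP = {pp:.3f}, reported PP = {pp_reported}")
```

Under these assumptions the computed values agree with the quoted PPs to within rounding. A prior as low as 0.001 means that a log(BF) of roughly 8 or more is needed to push the PP above the 0.75 threshold.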
Finally, I characterised patients harbouring mutations in two new pertinent genes, namely KDR and GDF2. I found that patients harbouring protein-truncating variants in KDR showed mild fibrotic changes on high-resolution computed tomography, which were further characterised as patchy bronchocentric fibrosis on a wedge biopsy from one patient. GDF2 mutation carriers were similar to PAH patients without mutations and showed no features of hereditary haemorrhagic telangiectasia (HHT) or vascular anomaly syndromes. GDF2 mutation carriers were significantly older and had less severe haemodynamics than BMPR2 mutation carriers, and did not constitute a distinct cluster. Additionally, I found a significant correlation between levels of the prodomain-bound form of bone morphogenetic protein 10 (pBMP10) and the risk of developing systemic hypertension among patients with IPAH. In summary, deep clinical and computational phenotyping allowed me to devise homogeneous groups of patients that were enriched for rare deleterious variants in known and new PAH risk genes.





Morrell, Nicholas
Graf, Stefan


forward phenotyping, deep phenotyping, computational phenotyping, clinical phenotyping, forward genetics, reverse genetics, reverse phenotyping, KDR, VEGFR2, COL6A5, GDF2, KCO, pulmonary arterial hypertension, BMPR2


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge
British Heart Foundation