
Proteomic Signatures of Type 2 Diabetes and Related Metabolic Traits





Carrasco Zanini Sánchez, Julia


Advances in profiling technologies have enabled systematic assessment of a plethora of molecules, from the genome via the proteome to a variety of metabolic phenotypes, providing unique opportunities to better understand disease mechanisms and to identify molecular signatures that can help to predict, screen for and diagnose different diseases using agnostic discovery approaches. The proteome is at the centre of biological information transfer and therefore provides a unique source of actionable discoveries. The increasing global burden of Type 2 diabetes (T2D) calls for improved strategies to identify individuals at high risk. However, the aetiology of this heterogeneous disease remains incompletely understood, and some subgroups are poorly predicted or overlooked. The main aim of this thesis was to identify plasma proteomic signatures for improved risk prediction and aetiological understanding of T2D and related metabolic traits.

I first conducted a systematic characterisation of the major phenotypic and genetic determinants of plasma protein levels, which identified a range of modifiable risk factors as causal determinants, improving the interpretation of disease biomarkers and candidate mediators. To assess the integrated whole-body response to high glucose ingestion, I systematically characterised protein changes during an oral glucose tolerance test (OGTT). Around 11% of the measured proteome changed after the OGTT, pointing to groups of proteins that responded differently in insulin-resistant compared with insulin-sensitive individuals and that were associated with long-term cardiometabolic consequences. To go beyond understanding the molecular mechanisms associated with or involved in glucose homeostasis and to explore the predictive potential of the plasma proteome, I applied a machine learning framework to ~5,000 proteins measured at fasting. I identified a three-protein signature that improved discrimination of isolated impaired glucose tolerance (iIGT) over clinical risk factors; iIGT defines a group at high risk of T2D and comorbidities that is missed by current screening guidelines. Further systematic assessment showed that a sparse signature of features across the proteome, metabolome and genome improved risk prediction for incident T2D, of which the major driver was the T2D polygenic score. However, individuals with HbA1c below the threshold for prediabetes but at high polygenic risk had a substantially lower cumulative T2D incidence than people with prediabetes. To investigate the predictive potential of plasma proteomics more broadly, and the disease specificity of predictive biomarkers, I developed sparse protein models for the prediction of 24 diverse incident diseases across a range of clinical specialties. As few as five proteins improved the predictive performance of patient-derived risk factors for seven diseases, revealing the potential of high-throughput proteomics for novel biomarker discovery and improved risk prediction across a range of diseases.

This thesis demonstrates the potential of broad-capture plasma proteomics for identification of novel aetiological pathways and predictive biomarkers to improve or refine risk prediction and screening strategies, specifically for subgroups that are poorly captured by current clinical risk factors.





Wareham, Nicholas
Langenberg, Claudia
Pietzner, Maik


Biomarkers, Machine learning, Mendelian randomisation, Molecular epidemiology, Polygenic risk scores, Prediction, Proteomics, Type 2 diabetes


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge
MRC (MC_UU_00006/1)
Medical Research Council (MC_UU_12015/1)
Wellcome Trust Cambridge Trust