Evaluating and enhancing cardiovascular disease risk prediction with algorithmic fairness


Abstract

Cardiovascular disease (CVD) is the leading cause of morbidity and mortality worldwide, with risk prediction models in widespread clinical use. Yet much remains unknown about the performance of CVD risk prediction models in specific subgroups, and disparities in predictions can exacerbate health inequities. Algorithmic fairness, a research area in machine learning, provides a quantitative approach to assess and address inequities in prediction models. This thesis uses the principles of algorithmic fairness to evaluate CVD risk prediction models using UK Biobank data. Viewed through the lens of algorithmic fairness, the performance of existing clinical models (such as QRISK3, used in primary care in the UK) is quantified. These models display equitable performance across fairness metrics for deprivation groups. However, there are significant differences in model calibration across self-reported ethnicity groups, with notable under-prediction of risk for South Asians. To correct for this, a post-processing sex-and-ethnicity-specific recalibration method is proposed. My evaluations demonstrate how targeted recalibration can reduce disparities, enhancing the fairness and accuracy of CVD risk prediction in diverse subgroups. Beyond model-level fairness, a novel analysis investigates the fairness of individual risk factors, examining their susceptibility to fairness issues and how this differs between risk factors. A nearest neighbours matching approach is implemented to correct for confounding in the fairness assessment of individual risk factors across demographic groups. Sex-based disparities in cholesterol thresholds used for clinical decision-making are identified, raising concerns about their equity. The fairness of conventional risk factors is investigated alongside polygenic risk scores (PRS), a novel risk factor proposed for clinical use, demonstrating that PRS are not inherently less fair than conventional individual risk factors.
The final results chapter introduces a two-stage method to integrate PRS into CVD risk stratification. The impact of PRS integration on both predictive performance and fairness is assessed, with findings suggesting that a PRS-integrated model can improve the detection of CVD events relative to QRISK3 without exacerbating unfairness for most population subgroups. The methodology and insights from this work provide an approach for evaluating and improving fairness in CVD risk prediction and beyond, with broader implications for health equity, model development, and clinical translation. By bridging the fields of computer science and epidemiology, a pathway is illuminated towards more equitable health risk prediction models.
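
As an illustration of the subgroup calibration assessment and group-specific recalibration described in the abstract, the sketch below computes an observed/expected (O/E) ratio per group and fits a separate intercept shift on the logit scale for each group. It is a minimal sketch only: the abstract does not specify the recalibration procedure used in the thesis, so the intercept-only approach, the function names, the group labels and the synthetic records are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def observed_expected_ratio(risks, events):
    """Observed event count divided by the sum of predicted risks.
    Values above 1 indicate under-prediction of risk."""
    return sum(events) / sum(risks)


def fit_intercept_shift(risks, events, lo=-10.0, hi=10.0, iters=100):
    """Bisection search for the logit-scale shift that makes the mean
    recalibrated risk equal to the observed event rate.
    Assumes every predicted risk lies strictly between 0 and 1."""
    target = sum(events) / len(events)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        mean_risk = sum(expit(logit(r) + mid) for r in risks) / len(risks)
        if mean_risk < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def fit_group_shifts(records):
    """records: iterable of (group, predicted_risk, event) tuples.
    Returns one logit-scale shift per group."""
    by_group = defaultdict(list)
    for group, risk, event in records:
        by_group[group].append((risk, event))
    return {
        group: fit_intercept_shift([r for r, _ in rows], [e for _, e in rows])
        for group, rows in by_group.items()
    }


# Synthetic data (hypothetical): both groups are assigned a 10% risk,
# but group "B" experiences twice as many events as predicted.
records = (
    [("A", 0.1, 1)] + [("A", 0.1, 0)] * 9
    + [("B", 0.1, 1)] * 2 + [("B", 0.1, 0)] * 8
)

shifts = fit_group_shifts(records)

# Apply each group's shift and re-check calibration for group "B".
b_rows = [(r, e) for g, r, e in records if g == "B"]
b_before = observed_expected_ratio([r for r, _ in b_rows], [e for _, e in b_rows])
b_recal = [expit(logit(r) + shifts["B"]) for r, _ in b_rows]
b_after = observed_expected_ratio(b_recal, [e for _, e in b_rows])
```

In this toy setting the O/E ratio for group "B" moves from about 2 before recalibration to about 1 afterwards, while group "A" receives a shift close to zero. A realistic analysis would also need to account for follow-up time and censoring, which this sketch ignores.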

Date

2024-12-23

Advisors

Wood, Angela
Inouye, Michael
Lambert, Samuel

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights and licensing

Except where otherwise noted, this item's license is described as All rights reserved

Sponsorship

Health Data Research UK (PHD2020CAM003)