Evaluating and enhancing cardiovascular disease risk prediction with algorithmic fairness

Coffey, Claire

doi:https://doi.org/10.17863/CAM.120789

Repository URI

https://www.repository.cam.ac.uk/handle/1810/388397

Repository DOI

https://doi.org/10.17863/CAM.120789

Files

Primary Thesis (7.38 MB)

Type

Thesis

Authors

Coffey, Claire

Abstract

Cardiovascular disease (CVD) is the leading cause of morbidity and mortality worldwide, with risk prediction models in widespread clinical use. Yet much remains unknown about the performance of CVD risk prediction models in specific subgroups, and disparities in predictions can exacerbate health inequities. Algorithmic fairness, a research area in machine learning, provides a quantitative approach to assess and address inequities in prediction models. This thesis uses the principles of algorithmic fairness to evaluate CVD risk prediction models using UK Biobank data. Viewed through the lens of algorithmic fairness, the performance of existing clinical models (such as QRISK3, used in primary care in the UK) is quantified. These models display equitable performance across fairness metrics for deprivation groups. However, there are significant differences in model calibration across self-reported ethnicity groups, with notable under-prediction of risk for South Asians. To correct for this, a post-processing sex-and-ethnicity-specific recalibration method is proposed. My evaluations demonstrate how targeted recalibration can reduce disparities, enhancing the fairness and accuracy of CVD risk prediction in diverse subgroups. Beyond model-level fairness, a novel analysis investigates the fairness of individual risk factors, examining their susceptibility to fairness issues and differences between risk factors. A nearest neighbours matching approach is implemented to correct for confounding in the fairness assessment of individual risk factors across demographic groups. Sex-based disparities in cholesterol thresholds used for clinical decision-making are identified, raising concerns about their equity. The fairness of conventional risk factors is investigated alongside a novel risk factor proposed for clinical use, polygenic risk scores (PRS), demonstrating that PRS are not inherently less fair than conventional individual risk factors.
The final results chapter introduces a two-stage method to integrate PRS into CVD risk stratification. The impact of PRS integration on both predictive performance and fairness is assessed, with findings suggesting that a PRS-integrated model can improve the detection of CVD events relative to QRISK3 without exacerbating unfairness for most population subgroups. The methodology and insights from this work provide an approach for evaluating and improving fairness in CVD risk prediction and beyond, with broader implications for health equity, model development, and clinical translation. By bridging the fields of computer science and epidemiology, a pathway is illuminated towards more equitable health risk prediction models.
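
The abstract mentions a post-processing, subgroup-specific recalibration but does not spell out its form. As a rough illustration only, the sketch below assumes the simplest variant, an intercept-only ("calibration-in-the-large") shift on the logit scale fitted separately per group; the thesis may well use a different recalibration model. The function names, the group labels "A" and "B", and the toy numbers are all hypothetical, and in practice the shift would be fitted on data held out from evaluation.

```python
import math
from collections import defaultdict


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def observed_expected_ratio(risks, events):
    """Observed event rate divided by mean predicted risk.

    A value above 1 indicates under-prediction, below 1 over-prediction.
    """
    return (sum(events) / len(events)) / (sum(risks) / len(risks))


def fit_intercept_shift(risks, events, tol=1e-10):
    """Bisection search for the logit shift that makes the mean
    recalibrated risk equal the observed event rate.

    Assumes the observed rate is strictly between 0 and 1.
    """
    target = sum(events) / len(events)
    lo, hi = -10.0, 10.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        mean_risk = sum(expit(logit(r) + mid) for r in risks) / len(risks)
        if mean_risk < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def recalibrate_by_group(risks, events, groups):
    """Fit and apply one intercept shift per group label.

    A label could be a (sex, ethnicity) tuple; any hashable value works.
    """
    index_by_group = defaultdict(list)
    for i, g in enumerate(groups):
        index_by_group[g].append(i)

    shifts = {}
    recalibrated = list(risks)
    for g, idx in index_by_group.items():
        shift = fit_intercept_shift([risks[i] for i in idx],
                                    [events[i] for i in idx])
        shifts[g] = shift
        for i in idx:
            recalibrated[i] = expit(logit(risks[i]) + shift)
    return recalibrated, shifts


# Toy data (invented for illustration): group "A" is under-predicted.
risks = [0.05, 0.10, 0.10, 0.15, 0.10, 0.20, 0.30, 0.20]
events = [0, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

recalibrated, shifts = recalibrate_by_group(risks, events, groups)
```

Comparing `observed_expected_ratio` within each group before and after recalibration gives a quick check on whether a calibration disparity has narrowed. An intercept-only shift leaves the ranking of individuals within a group unchanged, so within-group discrimination is unaffected.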

Date

2024-12-23

Advisors

Wood, Angela
Inouye, Michael
Lambert, Samuel

Keywords

Clinical risk prediction, Risk prediction, Algorithmic fairness, Health equity, Machine learning, Health data science, Polygenic risk scores, Cardiovascular disease

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights and licensing

Sponsorship

Health Data Research UK (PHD2020CAM003)

Relationships

Is supplemented by:

https://doi.org/10.1371/journal.pmed.1001779
https://doi.org/10.1038/s41588-021-00783-5

Collections

Theses - Public Health and Primary Care
