Clinical Presence: Impact on Predictive Modelling and Algorithmic Fairness
Abstract
Hospitals routinely collect large amounts of data that may provide insights beyond those of experimental studies, which are constrained by practical, ethical, or budgetary limitations. However, these observational data present critical challenges: (i) the quantity and diversity of modalities make these data less amenable to traditional statistical tools, and (ii) multiple factors influence what data are collected and when.
If not carefully considered, the observational process, referred to as clinical presence in this thesis, lessens the potential of observational data for predictive modelling. In particular, clinical presence reflects not only medical expertise and patient deterioration but also socio-medical disparities deeply ingrained in healthcare practices. Failure to disentangle signal from bias in clinical presence risks perpetuating and amplifying these disparities.
In this thesis, we explore the understudied impact of clinical presence on predictive modelling and associated algorithmic fairness properties. We propose to use machine learning to tackle scalability and modelling flexibility while accounting for statistical biases and potential socio-medical disparities associated with clinical presence. By connecting methodologies from machine learning, biostatistics, and algorithmic fairness, we aim to provide insights into clinical presence and its impact on predictive models and algorithmic fairness.
In each chapter, we explore a different aspect of clinical presence and its impact on predictive modelling and fairness. First, we examine the challenges of missingness and how practitioners' use of imputation influences algorithmic fairness. Then, we aim to discover subgroups of patients under-served by current medical practices under non-random treatment assignment. Beyond observed covariates and treatment assignment, we show that the process governing observed outcomes, particularly when it prevents the outcome of interest from being observed, may affect groups with distinct risk profiles differently. Finally, we investigate how irregularities in longitudinal medical data affect predictive models and their transportability.
Through this research, we aim to develop more equitable and accurate predictive models by addressing the complexities of clinical presence. By identifying and accounting for disparities deeply ingrained in medical history and, consequently, in medical practices and data, we can ensure that the benefits of novel medical tools are equitably distributed to all.
Advisors
Tom, Brian
Sponsorship
Alan Turing Institute (TUR-002003)

