On Critical Care Data and Machine Learning Loss Function Landscapes

Cafolla, Conor Thomas

Repository URI

https://www.repository.cam.ac.uk/handle/1810/364440

Repository DOI

https://doi.org/10.17863/CAM.106132

Type

Thesis

Authors

Cafolla, Conor Thomas

Abstract

Intensive Care Units (ICUs) are constantly under strain, with many vulnerable people needing urgent critical care. Often doctors do not have the capacity to accommodate everyone, and so it is important that patients are discharged when it is safe to do so. However, discharging a patient at the wrong time may result in the patient being readmitted or simply not surviving; both of these cases are highly undesirable. Therefore, it is useful for there to be an understanding of what factors contribute most to the mortality rate of patients in intensive care units.

In the present work, machine learning models are trained to predict the mortality of a given patient from a set of clinical measurements. The models are neural networks with one hidden layer and two outcomes (alive or deceased), and local minima of the neural network loss functions are found using basin-hopping methods implemented in the open-source software GMIN. In this way, the landscape of the loss function defined by the neural network model can be explored. Area Under the Curve (AUC) values are used to evaluate these models.
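
As an illustration of the procedure described above, the following is a minimal sketch, not the GMIN implementation. It assumes a tanh hidden layer, a softmax output over the two outcomes, a small L2 penalty, and a simple backtracking gradient descent as the local minimiser; all function names, step sizes and the temperature are hypothetical choices made for the example.

```python
import numpy as np

def unpack(theta, n_in, n_hidden):
    """Split a flat parameter vector into the weights and biases of the network."""
    i = n_in * n_hidden
    j = i + n_hidden
    k = j + 2 * n_hidden
    return (theta[:i].reshape(n_in, n_hidden), theta[i:j],
            theta[j:k].reshape(n_hidden, 2), theta[k:])

def loss_and_grad(theta, X, y, n_hidden, l2=1e-4):
    """Mean cross-entropy plus an L2 penalty, and its gradient w.r.t. theta."""
    n, n_in = X.shape
    W1, b1, W2, b2 = unpack(theta, n_in, n_hidden)
    H = np.tanh(X @ W1 + b1)                      # hidden layer
    Z = H @ W2 + b2
    Z = Z - Z.max(axis=1, keepdims=True)          # stabilise the softmax
    P = np.exp(Z)
    P /= P.sum(axis=1, keepdims=True)
    loss = -np.mean(np.log(P[np.arange(n), y] + 1e-12)) + l2 * (theta @ theta)
    dZ = P.copy()
    dZ[np.arange(n), y] -= 1.0
    dZ /= n
    dA = (dZ @ W2.T) * (1.0 - H ** 2)             # back-propagate through tanh
    grad = np.concatenate([(X.T @ dA).ravel(), dA.sum(axis=0),
                           (H.T @ dZ).ravel(), dZ.sum(axis=0)]) + 2 * l2 * theta
    return loss, grad

def local_minimise(theta, X, y, n_hidden, steps=200, lr=0.5):
    """Backtracking gradient descent (assumed stand-in for a quasi-Newton minimiser)."""
    loss, g = loss_and_grad(theta, X, y, n_hidden)
    for _ in range(steps):
        trial = theta - lr * g
        trial_loss, trial_g = loss_and_grad(trial, X, y, n_hidden)
        if trial_loss < loss:
            theta, loss, g = trial, trial_loss, trial_g
        else:
            lr *= 0.5                             # step too long: shrink it
    return theta, loss

def basin_hopping(X, y, n_hidden=3, n_hops=20, step=1.0, temperature=0.05, seed=0):
    """Perturb, minimise, then accept or reject with a Metropolis criterion."""
    rng = np.random.default_rng(seed)
    n_params = X.shape[1] * n_hidden + n_hidden + 2 * n_hidden + 2
    theta, loss = local_minimise(rng.normal(size=n_params), X, y, n_hidden)
    minima = [(loss, theta)]
    for _ in range(n_hops):
        perturbed = theta + step * rng.normal(size=n_params)
        trial, trial_loss = local_minimise(perturbed, X, y, n_hidden)
        minima.append((trial_loss, trial))
        if trial_loss < loss or rng.random() < np.exp((loss - trial_loss) / temperature):
            theta, loss = trial, trial_loss
    return sorted(minima, key=lambda m: m[0])     # lowest loss first

def predict_proba(theta, X, n_hidden):
    """Predicted probability of outcome 1 for each row of X."""
    W1, b1, W2, b2 = unpack(theta, X.shape[1], n_hidden)
    Z = np.tanh(X @ W1 + b1) @ W2 + b2
    P = np.exp(Z - Z.max(axis=1, keepdims=True))
    return P[:, 1] / P.sum(axis=1)

def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    diff = scores[labels == 1][:, None] - scores[labels == 0][None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())
```

Calling `basin_hopping(X, y)` with a feature matrix `X` and integer labels `y` in {0, 1} returns the list of local minima visited, sorted by loss; `auc(predict_proba(theta, X, n_hidden), y)` then scores any one of them. Keeping every minimum, rather than only the lowest, is what allows the landscape to be examined afterwards.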

Two databases, MIMIC III and Amsterdam UMC db, are compared. There are many possible calculations to perform, and initially time series for single variables and pairs of variables are used as inputs to the neural network. From MIMIC III, Glasgow Coma Scale (GCS) and Blood Urea Nitrogen (BUN) perform well, with AUCs just below 0.8 on their own, and an AUC above 0.8 together. From Amsterdam UMC db, Blood Pressure (BP) measurements perform well, with AUCs around 0.8. In general, models trained on the Amsterdam UMC db data appear to outperform those trained on MIMIC III. The effect of training a model on one time window and evaluating it on different time windows is also investigated. We find that the AUC value decreases, but not substantially, for most clinical variables, suggesting that the most recent data is the most useful for mortality prediction. Respiration Rate is a notable exception: data from earlier times may actually provide more prognostic value than the most recent measurements. A permutational shuffling analysis is also performed, which reveals patterns in the way the data is organised and sheds light on some innate properties of the data.

The data from the two ICU databases are then applied to another model, where inputs to the neural network are the worst values of a set of pre-chosen clinical variables, inspired by a score used elsewhere in the medical prognosis picture (APACHE II). The AUCs obtained in this way are generally better than for the time-series data above, with AUCs reaching just under 0.8 for MIMIC III and 0.85 for Amsterdam UMC db.
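
The idea of a worst-value input can be sketched as follows. This is an assumption-laden illustration, not the thesis's feature pipeline: "worst" is taken here to mean the reading furthest outside a reference range, in the spirit of APACHE II, and the function names, variable names and ranges are hypothetical placeholders supplied by the caller.

```python
def worst_value(values, low, high):
    """Return the reading furthest outside [low, high]; in-range readings deviate by 0."""
    def deviation(v):
        return max(low - v, v - high, 0.0)
    return max(values, key=deviation)

def worst_features(stay, ranges):
    """Build one input vector per ICU stay: the worst reading of each chosen variable.

    `stay` maps a variable name to its list of readings; `ranges` maps the same
    names to (low, high) reference bounds. Both layouts are assumed for this sketch.
    """
    return [worst_value(stay[name], *ranges[name]) for name in sorted(ranges)]
```

For instance, with a placeholder range of 60 to 100 for a variable, the readings 72, 135 and 88 give 135 as the worst value, because it is the only one outside the range.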

Synthetic spiral data is created to test some new machine learning methods, including an ensemble-like method where minima from the loss function landscape are combined in a process called Machine Learning Superposition (MLSUP). Minima are selected to maximise the diversity between them; pairs of minima are identified as suitable candidates for MLSUP by their misclassification index and their contributions to heat capacity peaks. We find that MLSUP outperforms a single neural network model for larger neural network architectures, but is not as useful for smaller ones. MLSUP is also applied to the ICU databases described above; however, we find that there is little improvement over the AUCs obtained by the single neural networks.

The synthetic data is further used to explore two new landscapes: the landscape defined by a loss function designed to closely resemble the AUC function, and a landscape where narrower minima are penalised in energy (Sharpness Aware Minimisation, SAM). In both cases, AUCs from the synthetic data and the real data are comparable to those from the more conventional "cross entropy" loss function, but do not offer much improvement. Coupled with the fact that these new loss functions have higher-order complexity, and hence take longer to evaluate, it is concluded that these landscapes are not practically useful. However, they still offer great insight into the nature of machine learning models and their landscapes.
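
For orientation, the usual formulation of Sharpness Aware Minimisation from the machine learning literature is given below; the exact form used in the thesis may differ. The symbols are assumptions introduced here: $\mathcal{L}$ is the base loss (for example cross entropy), $\mathbf{w}$ the network weights, and $\rho$ the radius of the neighbourhood searched.

```latex
\mathcal{L}_{\mathrm{SAM}}(\mathbf{w})
  = \max_{\|\boldsymbol{\epsilon}\|_2 \le \rho} \mathcal{L}(\mathbf{w} + \boldsymbol{\epsilon})
  \approx \mathcal{L}\!\left(\mathbf{w}
      + \rho\, \frac{\nabla \mathcal{L}(\mathbf{w})}{\|\nabla \mathcal{L}(\mathbf{w})\|_2}\right)
```

Under the first-order approximation on the right, each evaluation of $\mathcal{L}_{\mathrm{SAM}}$ already needs a gradient of the base loss, and its own gradient involves second derivatives of $\mathcal{L}$. This is consistent with the longer evaluation times noted above.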

This thesis concludes with an overview of the work completed and some closing thoughts on the use of artificial intelligence (AI) in the healthcare setting, discussing how it should be used while remaining alert to the dangers it could present.

Date

2023-09-29

Advisors

Wales, David

Keywords

Data Science, Energy Landscapes, Intensive Care Unit, Machine Learning

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

Engineering and Physical Sciences Research Council (2275899)

Collections

Theses - Chemistry