
From big data to personal narratives: a supervised learning framework for decoding the course of traumatic brain injury in intensive care





Bhattacharyay, Shubhayu


The management of traumatic brain injury (TBI) in the intensive care unit (ICU) generates vast clinical data, much of which is never analysed or interpreted. At the same time, the dynamic, complex disease course of TBI is not sufficiently characterised for truly patient-tailored treatment. This thesis capitalises on an opportunity to widen the context of information considered by individualised, dynamic models of functional outcome and therapeutic intensity after TBI. This opportunity is jointly afforded by the large-scale data collection of the Collaborative European NeuroTrauma Effectiveness Research in TBI (CENTER-TBI) study and recent advances in machine learning (ML) for time series modelling.

This thesis combines a range of neural network (NN) architectures to propose a methodological framework by which all of the CENTER-TBI data collected before and during a patient's ICU stay can be dynamically mapped to ordinal endpoints. All of the CENTER-TBI variables are tokenised and embedded into lower-dimensional vectors which are then fed into gated recurrent neural networks (RNNs) to identify time-varying, informative patterns from the full dataset. The RNN outputs are then decoded by an ordinal output layer to return probability estimates, calibrated on validation sets, at each threshold of the endpoint. Regularised model weights are trained through supervised learning, and the reliability and information content of the modelling strategy are evaluated with repeated k-fold cross validation. Finally, the contribution of recorded clinical events to trained model outputs is estimated with a temporal extension of the SHapley Additive exPlanations (TimeSHAP) algorithm.
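
The ordinal decoding step described above can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes, for illustration only, that the output layer emits one logit per endpoint threshold, that a sigmoid turns each logit into the probability of exceeding that threshold, and that monotonicity across thresholds is enforced with a simple running minimum. The function names and the example logits are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def threshold_probabilities(logits):
    """Map one logit per threshold to P(Y > k), non-increasing in k.

    Assumption: thresholds are listed from lowest to highest.
    """
    probs = [sigmoid(z) for z in logits]
    # Exceeding a higher threshold cannot be more likely than
    # exceeding a lower one, so clip with a running minimum.
    for k in range(1, len(probs)):
        probs[k] = min(probs[k], probs[k - 1])
    return probs


def category_probabilities(exceed):
    """Convert K-1 threshold probabilities into K ordered category probabilities."""
    bounded = [1.0] + list(exceed) + [0.0]
    return [bounded[k] - bounded[k + 1] for k in range(len(bounded) - 1)]


# Hypothetical logits for one patient at one time step (four thresholds,
# hence a five-category endpoint).
logits = [2.0, 1.0, 0.0, -1.0]
exceed = threshold_probabilities(logits)
cats = category_probabilities(exceed)
print([round(p, 3) for p in exceed])
print([round(p, 3) for p in cats])
```

Because each category probability is the difference between adjacent threshold probabilities, the categories always sum to one and none can be negative once the threshold probabilities are monotone. In a dynamic setting, the same conversion would be applied to the outputs at every time step.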

The first endpoint of my supervised learning framework is functional outcome at six months according to the Glasgow Outcome Scale – Extended (GOSE). For ordinal GOSE prediction, expanding the predictor set with the tokenisation-embedding encoder (i.e., making models ‘wider’) significantly improves prediction performance, whilst adding hidden layers (i.e., making models ‘deeper’) does not. Functional outcome prediction is more difficult at higher GOSE thresholds and for patients with longer ICU stays. The full set of CENTER-TBI variables accounts for approximately half of the ordinal variation in GOSE, and static (pre-ICU and admission) variables account for the vast majority of this prognostic information. Variables with the greatest contribution to prognosis include physician-based impressions, imaging features, protein biomarkers, and neurological assessments.

Then, I perform a clinimetric validation of the Therapy Intensity Level (TIL) scale and its five-category summary (TIL(Basic)) to measure the overall intensity of intracranial pressure (ICP) management. With TIL(Basic) as the second endpoint of the modelling framework, the full range of CENTER-TBI variables again explains approximately half of the ordinal variation in next-day transitions of ICP management after the second day of ICU stay. A patient's prior treatments, age, brain lesions, ICP, metabolic derangements, serial protein biomarkers, and neurological function are most predictive of future changes in TIL. However, a considerable proportion of this variation remains unaccounted for, suggesting a significant influence of a physician's preferences or unmeasured factors in contemporary ICP management.

Supervised ML with NN-based architectures proves useful for improving the detail of model inputs (over time) and outputs but requires thorough assessment of potential overfitting. Insights from this thesis can inform the design of dynamic causal inference models and future data-collection or informatics projects for TBI.





Ercole, Ari
Menon, David


traumatic brain injury, intensive care unit, intracranial pressure, therapy intensity level, machine learning, data mining, outcome prediction, dynamic prediction, ordinal prediction, word embedding, recurrent neural networks, Shapley values, time series analysis, clinimetrics, supervised learning


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge
European Commission (602150)
MRC (4050062263 A853-0125)
Gates Cambridge Scholarship (2020-2024)
Is supplemented by: