Relaxing assumptions in deep probabilistic modelling

Deasy, Jacob 

The current generation of deep neural network-based models demonstrates a tremendous capacity to learn distributions at scale. Given this success, deep learning and deep generative modelling have progressively been applied to a broader range of increasingly demanding applications, including safety-critical domains such as healthcare. However, existing models rely on restrictive theoretical assumptions, stemming from the longstanding distributions and divergences at their core, which inhibit their continued advance. By leveraging wider distribution and divergence families, broader parametric assumptions can be transferred to deep generative models, increasing the scope of the functions they can approximate. In particular, the Kullback-Leibler divergence and the Gaussian distribution are assumed at the heart of variational autoencoders and score-based models, and are central to their limitations. This thesis argues that both assumptions can be viewed through wider lenses: the skew-geometric Jensen-Shannon divergence family and the generalised normal distribution family respectively.
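The skew-geometric Jensen-Shannon divergence named above admits a closed form between Gaussians, which makes the role of the skew parameter concrete. The following is an illustrative sketch only (not code from the thesis; function names are hypothetical), assuming Nielsen's definition JS^{G_alpha}(p||q) = (1-alpha) KL(p||G_alpha) + alpha KL(q||G_alpha), where G_alpha is the normalised weighted geometric mean of p and q:

```python
import math

def kl_gauss(mu0, var0, mu1, var1):
    """KL(N(mu0, var0) || N(mu1, var1)) in closed form."""
    return 0.5 * (math.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def js_g_alpha(mu_p, var_p, mu_q, var_q, alpha):
    """Skew-geometric Jensen-Shannon divergence between univariate Gaussians.

    The weighted geometric mean of two Gaussians is itself Gaussian, with
    natural parameters (precision and precision-weighted mean) interpolated
    linearly by alpha.
    """
    prec_a = (1 - alpha) / var_p + alpha / var_q      # interpolated precision
    var_a = 1.0 / prec_a
    mu_a = var_a * ((1 - alpha) * mu_p / var_p + alpha * mu_q / var_q)
    return ((1 - alpha) * kl_gauss(mu_p, var_p, mu_a, var_a)
            + alpha * kl_gauss(mu_q, var_q, mu_a, var_a))

# alpha skews the intermediate distribution between p (alpha = 0) and
# q (alpha = 1); alpha = 0.5 gives the symmetric geometric JS divergence.
print(js_g_alpha(0.0, 1.0, 2.0, 1.0, 0.5))  # → 0.5
```

The single skew parameter is what a JSGα-VAE tunes in place of a fixed KL term, which is why it remains interpretable as a position along the geometric path between the two distributions.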

Several contributions are made both to the theory of deep learning, specifically deep generative modelling, and to its application to electronic health records (EHRs). Firstly, a new type of variational autoencoder is introduced which capitalises on the flexibility of the skew-geometric Jensen-Shannon divergence to overcome the theoretical shortcomings and lack of interpretability of prior latent space constraints. These JSGα-VAEs achieve better reconstruction and generation than baseline VAEs while using a single hyperparameter that is easily interpreted in latent space. Secondly, heavy-tailed denoising score matching (HTDSM) is proposed, motivated by the superior concentration of measure of a heavy-tailed noising distribution in high-dimensional space. HTDSM offers improved score estimation, controllable sampling convergence, and more class-balanced unconditional generation. Finally, several results are presented which indicate that generalising EHR pipelines and models yields greater flexibility and clinical utility: restrictions on data collation, curation, and chronology are removed while competitive performance is maintained on clinical objectives such as mortality prediction.
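The generalised normal family behind the heavy-tailed noising distribution contains the Gaussian (shape β = 2) and the heavier-tailed Laplace (β = 1) as special cases. A small illustrative sketch (not from the thesis; the function name is hypothetical) of its density and of how tail mass grows as β decreases:

```python
import math

def gen_normal_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Density of the generalised normal distribution with location mu,
    scale alpha, and shape beta:

        p(x) = beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x - mu| / alpha)**beta)

    beta = 2 recovers a Gaussian (with alpha = sqrt(2) * sigma);
    beta = 1 recovers a Laplace; beta < 2 gives heavier-than-Gaussian tails.
    """
    coeff = beta / (2 * alpha * math.gamma(1 / beta))
    return coeff * math.exp(-(abs(x - mu) / alpha) ** beta)

# With the same scale, the beta = 1 member places vastly more density
# five scale units from the mean than the beta = 2 (Gaussian) member.
print(gen_normal_pdf(5.0, beta=1.0) / gen_normal_pdf(5.0, beta=2.0))
```

Noising data with such a heavy-tailed member rather than a Gaussian is the sense in which HTDSM relaxes the standard distributional assumption of denoising score matching.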

Liò, Pietro
Ercole, Ari
deep learning, machine learning, statistics, VAEs, SBMs
Doctor of Philosophy (PhD)
Awarding Institution
University of Cambridge