Advances in Probabilistic Modelling: Sparse Gaussian Processes, Autoencoders, and Few-shot Learning


Type

Thesis

Authors

Abstract

Learning is the ability to generalise beyond training examples; but because many generalisations are consistent with a given set of observations, all machine learning methods rely on inductive biases to select certain generalisations over others. This thesis explores how model structure and priors affect the inductive biases of probabilistic models, and thereby our ability to learn and make inferences from data.

Specifically, we present theoretical analyses alongside algorithmic and modelling advances in three areas of probabilistic machine learning: sparse Gaussian process approximations and invariant covariance functions, learning flexible priors for variational autoencoders, and probabilistic approaches to few-shot learning. Because exact inference is rarely tractable, variational inference methods feature throughout as a secondary theme.
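
For context (a standard result, not specific to this thesis), variational inference fits an approximate posterior q(z) by maximising the evidence lower bound (ELBO) on the log marginal likelihood,

    \log p(x) \;\ge\; \mathbb{E}_{q(z)}\!\left[\log p(x \mid z)\right] - \mathrm{KL}\!\left(q(z)\,\|\,p(z)\right),

whose gap is exactly \mathrm{KL}\!\left(q(z)\,\|\,p(z \mid x)\right), so tightening the bound also improves the quality of the posterior approximation.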

First, we disentangle the theoretical properties and optimisation behaviour of two widely used sparse Gaussian process approximations. We conclude that the variational free energy approximation is more principled and more extensible, and should be used in practice despite potential optimisation difficulties. We then discuss how general symmetries and invariances can be incorporated into Gaussian process priors and learned from data using the marginal likelihood. To make inference tractable, we develop a variational inference scheme that relies on unbiased estimates of the intractable covariance functions.
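
For reference, the collapsed variational free energy bound for sparse Gaussian process regression with Gaussian noise and m inducing points (Titsias, 2009) takes the form

    \mathcal{F} \;=\; \log \mathcal{N}\!\left(\mathbf{y} \mid \mathbf{0},\, Q_{nn} + \sigma^2 I\right) - \frac{1}{2\sigma^2}\,\mathrm{tr}\!\left(K_{nn} - Q_{nn}\right), \qquad Q_{nn} = K_{nm} K_{mm}^{-1} K_{mn},

where K_{nm} collects covariances between the n training inputs and the m inducing inputs. This is a standard result from the sparse GP literature, quoted here only for orientation (the notation is not taken from the thesis); the trace term is what distinguishes the variational bound from likelihood approximations such as FITC and regularises the placement of the inducing inputs.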

We then address the mismatch between the aggregate posterior and the prior in variational autoencoders, and propose a mechanism for defining flexible distributions through a form of rejection sampling. We use this approach to construct a more flexible prior over the latent space of a variational autoencoder, which generalises to unseen test data and, in a practical way, reduces the number of low-quality samples from the model.
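
To make the mechanism concrete, here is a minimal sketch of sampling from a rejection-sampling-based prior p(z) proportional to pi(z) a(z), assuming a proposal distribution pi(z) and a learned acceptance function a(z) with values in [0, 1]; the function names, the truncation parameter and the toy acceptance function are illustrative assumptions, not the thesis code.

import numpy as np

def sample_resampled_prior(sample_proposal, acceptance_prob, max_tries=100, rng=None):
    # Propose z ~ pi(z) and accept it with probability a(z). After max_tries - 1
    # rejections the final proposal is accepted unconditionally (truncation),
    # so sampling is guaranteed to terminate.
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(max_tries - 1):
        z = sample_proposal()
        if rng.uniform() < acceptance_prob(z):
            return z
    return sample_proposal()

# Illustrative usage: a standard normal proposal over an 8-dimensional latent
# space, with a toy acceptance function standing in for a learned network a(z).
rng = np.random.default_rng(0)
z = sample_resampled_prior(
    sample_proposal=lambda: rng.normal(size=8),
    acceptance_prob=lambda z: 1.0 / (1.0 + np.exp(np.sum(z ** 2) - 8.0)),
    rng=rng,
)

The reweighted family contains the original proposal as a special case (constant acceptance), so the prior can only become more flexible while sampling remains cheap.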

Finally, we propose two probabilistic approaches to few-shot learning that achieve state-of-the-art results on benchmarks, building on multi-task probabilistic models with adaptive classifier heads. Our first approach combines a pre-trained deep feature extractor with a simple probabilistic model for the head, and can be linked to automatically regularised softmax regression. The second employs an amortised head model; it can be viewed as meta-learning probabilistic inference for prediction, and generalises to other settings such as few-shot regression.
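
As a rough illustration of the amortised head idea (a hypothetical sketch under our own assumptions, not the thesis implementation): per-class summaries of pre-trained features are mapped by a shared amortisation network to classifier weights, which are then used for softmax prediction on the query set.

import numpy as np

def amortised_head(support_features, support_labels, num_classes, psi):
    # Map each class's mean feature vector through a shared amortisation map psi
    # (a stand-in for a learned network) to obtain the softmax weights of that class.
    dim = support_features.shape[1]
    weights = np.zeros((num_classes, dim))
    for c in range(num_classes):
        class_mean = support_features[support_labels == c].mean(axis=0)
        weights[c] = psi @ class_mean
    return weights

def predict(query_features, weights):
    # Softmax predictions for the query points given the generated head weights.
    logits = query_features @ weights.T
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Illustrative 5-way, 1-shot episode with random stand-ins for pre-trained features.
rng = np.random.default_rng(0)
support = rng.normal(size=(5, 64))
labels = np.arange(5)
psi = rng.normal(size=(64, 64)) / np.sqrt(64.0)
probs = predict(rng.normal(size=(10, 64)), amortised_head(support, labels, 5, psi))

Because the head weights come from a single forward pass rather than per-task optimisation, adaptation to a new task is fast, which is one of the main attractions of amortising the head.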

Date

2019-09-17

Advisors

Rasmussen, Carl Edward
Schölkopf, Bernhard

Keywords

Machine Learning, Probabilistic Modelling, Gaussian Processes, Variational Autoencoders, Few-shot Learning, Meta-Learning, Transfer Learning, Approximate Bayesian Inference, Variational Inference, Inductive Biases, Amortised Inference, Sparse Gaussian Processes, Invariance

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

UK Engineering and Physical Sciences Research Council (EPSRC) DTA, Qualcomm Studentship in Technology, Max Planck Society