Approximate Inference in Bayesian Neural Networks and Translation Equivariant Neural Processes

Foong, Yue Kwang

doi:10.17863/CAM.91712

Repository URI

https://www.repository.cam.ac.uk/handle/1810/344287

Repository DOI

https://doi.org/10.17863/CAM.91712

Files

Thesis (15.35 MB)

Type

Thesis

Authors

Foong, Yue Kwang

Abstract

It has been a longstanding goal in machine learning to develop flexible prediction methods that ‘know what they don’t know’ — when faced with an out-of-distribution input, these models should signal their uncertainty rather than be confidently wrong. This thesis is concerned with two such probabilistic machine learning models: Bayesian neural networks and neural processes. Bayesian neural networks are a classical model that has been the subject of research since the 1990s. They rely on Bayesian inference to represent uncertainty in the weights of a neural network. On the other hand, neural processes are a recently introduced model that relies on meta-learning rather than Bayesian inference to obtain uncertainty estimates.

This thesis provides contributions to both of these research areas. For Bayesian neural networks, we provide a theoretical and empirical study of the quality of common variational methods in approximating the Bayesian predictive distribution. We show that for single-hidden-layer networks with ReLU activation functions, there are fundamental limitations concerning the representation of in-between uncertainty: increased uncertainty in between well-separated regions of low uncertainty. We show that this theoretical limitation does not apply to deeper networks. However, in practice, in-between uncertainty is a feature of the exact predictive distribution that is still often lost by approximate inference, even with deep networks.

In the second part of this thesis, we focus on neural processes. In contrast to Bayesian neural networks, neural processes do not rely on approximate inference. Instead, they use neural networks to directly parameterise the map from a dataset to the posterior predictive stochastic process conditioned on that dataset. In this thesis, we introduce the convolutional neural process, a new kind of neural process architecture that incorporates translation equivariance into its predictions. We show that when this symmetry is an appropriate assumption, convolutional neural processes outperform their standard multilayer perceptron-based and attentive counterparts on a variety of regression benchmarks.

Date

2022-09-21

Advisors

Turner, Richard

Keywords

deep learning, Bayesian inference, machine learning, meta-learning

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Collections

Theses - Engineering
