Uncertainty in Neural Networks: Bayesian Ensembles, Priors & Prediction Intervals


Type

Thesis

Authors

Pearce, Tim 

Abstract

The breakout success of deep neural networks (NNs) in the 2010s marked a new era in the quest to build artificial intelligence (AI). With NNs as the building block of these systems, excellent performance has been achieved on narrow, well-defined tasks where large amounts of data are available.

However, these systems lack certain capabilities that are important for broad use in real-world applications. One such capability is the communication of uncertainty in a NN's predictions and decisions. In applications such as healthcare recommendation or heavy machinery prognostics, it is vital that AI systems be aware of and express their uncertainty – this creates safer, more cautious, and ultimately more useful systems.

This thesis explores how to engineer NNs to communicate robust uncertainty estimates on their predictions, whilst minimising the impact on usability. One way to encourage uncertainty estimates to be robust is to adopt the Bayesian framework, which offers a principled approach to handling uncertainty. Two of the major contributions in this thesis relate to Bayesian NNs (BNNs).

Specifying appropriate priors is an important step in any Bayesian model, yet it is not clear how to do this in BNNs. The first contribution shows that the connection between BNNs and Gaussian Processes (GPs) provides an effective lens through which to study BNN priors. NN architectures are derived that mirror the combination of GP kernels, creating priors tailored to a task.
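
As a rough illustration of the idea (a minimal numpy sketch, not the thesis's construction; the single-hidden-layer networks, widths, and weight scales are illustrative): for independent zero-mean sub-networks, summing their outputs gives a prior whose kernel is the sum of the individual kernels, while multiplying them gives the product kernel.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)[:, None]

def sample_prior_fn(x, width=2048, w_scale=1.0):
    """Draw one function from the prior of a wide single-hidden-layer NN
    (weights and biases drawn from zero-mean Gaussians)."""
    W1 = rng.normal(0.0, w_scale, size=(1, width))
    b1 = rng.normal(0.0, w_scale, size=width)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, 1))
    return (np.tanh(x @ W1 + b1) @ W2)[:, 0]

# Two independent sub-networks, each defining its own prior over functions.
f1 = np.stack([sample_prior_fn(x) for _ in range(500)])
f2 = np.stack([sample_prior_fn(x, w_scale=3.0) for _ in range(500)])

# Combining the outputs combines the induced kernels (empirically, up to
# Monte Carlo error):
k1 = np.cov(f1, rowvar=False)
k2 = np.cov(f2, rowvar=False)
k_add = np.cov(f1 + f2, rowvar=False)  # additive architecture  ~ k1 + k2
k_mul = np.cov(f1 * f2, rowvar=False)  # multiplicative variant ~ k1 * k2
```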

The second major contribution is a novel way to perform approximate Bayesian inference in BNNs using a modified version of ensembling. New analysis improves understanding of a technique known as randomised MAP sampling, and shows it is particularly effective when strong correlations exist between parameters, making it well suited to NNs.
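
A minimal sketch of the randomised-MAP-sampling recipe, applied to a linear-in-parameters model (fixed random features standing in for a NN) so each MAP step has a closed form; the feature map, prior variance, and noise level are illustrative rather than the thesis's exact setup. Each ensemble member is regularised towards its own anchor drawn from the prior, rather than towards zero, and the spread of the trained members provides the uncertainty estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data.
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)

# Fixed random features, a stand-in for a NN's hidden layer.
Phi = np.tanh(X @ rng.standard_normal((1, 50)) + rng.standard_normal(50))

prior_var, noise_var, n_members = 1.0, 0.01, 10
preds = []
for _ in range(n_members):
    # Each member draws its own anchor from the prior ...
    w_anchor = rng.normal(0.0, np.sqrt(prior_var), size=Phi.shape[1])
    # ... and is trained to the MAP of a loss regularised towards that anchor:
    #   ||y - Phi w||^2 / noise_var + ||w - w_anchor||^2 / prior_var
    A = Phi.T @ Phi / noise_var + np.eye(Phi.shape[1]) / prior_var
    b = Phi.T @ y / noise_var + w_anchor / prior_var
    w_map = np.linalg.solve(A, b)
    preds.append(Phi @ w_map)

preds = np.stack(preds)
mean, std = preds.mean(axis=0), preds.std(axis=0)  # ensemble spread ~ posterior uncertainty
```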

The third major contribution of the thesis is a non-Bayesian technique that trains a NN to directly output prediction intervals for regression tasks through a tailored objective function. This advances on related work that was incompatible with gradient descent and ignored one source of uncertainty.
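
To make the flavour of such an objective concrete, below is a hedged numpy sketch of a coverage-versus-width loss of this kind: a soft, differentiable estimate of coverage is penalised when it falls below the target level, while the width of the intervals that capture their point is kept small. The soft-indicator form, constants, and scaling are illustrative and not the thesis's exact formulation.

```python
import numpy as np

def interval_loss(y, lower, upper, alpha=0.05, lam=15.0, softness=160.0):
    """Illustrative coverage-plus-width objective for prediction intervals."""
    n = len(y)
    # Soft (differentiable) indicator that each point lies inside its interval.
    k_soft = (1.0 / (1.0 + np.exp(-softness * (upper - y)))
              * 1.0 / (1.0 + np.exp(-softness * (y - lower))))
    # Hard indicator, used to measure the width of captured intervals only.
    k_hard = ((y <= upper) & (y >= lower)).astype(float)
    mean_captured_width = np.sum((upper - lower) * k_hard) / (np.sum(k_hard) + 1e-6)
    # Penalise coverage only when it falls short of the target 1 - alpha.
    coverage = np.mean(k_soft)
    shortfall = max(0.0, (1.0 - alpha) - coverage)
    return mean_captured_width + lam * np.sqrt(n) / (alpha * (1.0 - alpha)) * shortfall ** 2
```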

Date

2020-08-01

Advisors

Neely, Andy
Brintrup, Alexandra

Keywords

artificial intelligence, neural networks, deep learning, uncertainty

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

EPSRC (1741859)
EPSRC, Alan Turing Institute