Improving Deep Learning with Probabilistic Approaches



Abstract

Despite its successes in scaling to real-world problems, deep learning is not without flaws. In particular, it struggles with uncertainty quantification and data efficiency. Probabilistic methods, while currently somewhat underappreciated by the wider machine learning community, provide calibrated uncertainty estimates and tend to shine in the low-data regime. It would seem that probabilistic methods complement deep learning. Thus, we ask the question, "How can probabilistic approaches be used to improve deep learning?"

On the topic of uncertainty estimation, we have three sets of contributions. Firstly, we show that probabilistic inference over the depth of a neural network not only side-steps the challenges of scaling inference to the large weight spaces of modern neural networks but also provides well-calibrated uncertainty estimates and robust predictions. Furthermore, we leverage the uncertainty over depth for applications such as neural architecture search and active learning.
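To make the depth-inference idea concrete, the following is a minimal sketch of marginalising predictions over network depth: a network with a prediction head at every depth, whose outputs are averaged under a categorical posterior over depth. The residual MLP architecture, the variational `depth_logits` parameter, and all names are assumptions for exposition, not the exact model from the thesis.

```python
# Minimal sketch: Bayesian model averaging over network depth.
import torch
import torch.nn as nn


class DepthMarginalisedMLP(nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, max_depth):
        super().__init__()
        self.stem = nn.Linear(in_dim, hidden_dim)
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.ReLU(), nn.Linear(hidden_dim, hidden_dim))
             for _ in range(max_depth)]
        )
        # One prediction head per depth: every sub-network of depth
        # d = 1..max_depth produces its own output.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_dim, out_dim) for _ in range(max_depth)]
        )
        # Categorical (variational) posterior over depth.
        self.depth_logits = nn.Parameter(torch.zeros(max_depth))

    def forward(self, x):
        h = self.stem(x)
        per_depth = []
        for block, head in zip(self.blocks, self.heads):
            h = h + block(h)                       # residual block
            per_depth.append(torch.softmax(head(h), dim=-1))
        probs = torch.stack(per_depth)             # (depth, batch, classes)
        q_depth = torch.softmax(self.depth_logits, dim=0)
        # Average predictive distributions, weighted by the depth posterior.
        return torch.einsum("d,dbo->bo", q_depth, probs)
```

Because only one forward pass through the deepest network is needed to read off every depth's prediction, this style of model averages over depths at little extra cost compared with a single deterministic network.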

Secondly, we develop an alternative framework for dealing with these large weight spaces: we perform inference over only a small subset of the weights in a neural network. We show that crude posterior approximations suffice for selecting this subset, even though using them for inference over the weights themselves would be harmful. In particular, we find that capturing correlations between weights is essential for uncertainty estimation.
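A minimal sketch of the subnetwork idea follows, under the assumption that weights are scored with a cheap diagonal variance proxy and a full-covariance Gaussian is then fit over only the selected subset. The function names, the 2% budget, and the use of PyTorch distributions are illustrative, not the thesis's exact procedure.

```python
# Minimal sketch: select a subnetwork with a crude score, then use a
# correlated Gaussian posterior over that subnetwork only.
import torch


def select_subnetwork(diag_variances: torch.Tensor, budget: float = 0.02):
    """Pick the indices of the weights with the largest approximate
    marginal posterior variance. The crude diagonal scores are used
    only for *selection*, never as the final posterior."""
    k = max(1, int(budget * diag_variances.numel()))
    return torch.topk(diag_variances, k).indices


def sample_weights(map_weights, subnet_idx, subnet_mean, subnet_cov):
    """Sample a full weight vector: MAP values everywhere except the
    subnetwork, which gets a correlated Gaussian sample, capturing the
    weight correlations that matter for uncertainty estimation."""
    dist = torch.distributions.MultivariateNormal(subnet_mean, subnet_cov)
    w = map_weights.clone()
    w[subnet_idx] = dist.sample()
    return w
```

The design point is the asymmetry: selection can tolerate a crude, factorised approximation, whereas the posterior over the chosen weights keeps a full covariance.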

Thirdly, we investigate uncertainty estimation in sparse Mixture-of-Experts models. These models learn multiple diverse explanations of the data. We show that averaging these explanations results in robust predictions with well-calibrated uncertainty estimates. We provide an algorithm for doing so without incurring a heavy computational cost.
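The sketch below illustrates generic sparse Mixture-of-Experts prediction averaging: each input is routed to its top-k experts and their predictive distributions are averaged under the gate weights. The linear gating network, the top-k routing, and the dense evaluation of all experts (for clarity only) are standard MoE conventions assumed here, not the thesis's specific algorithm.

```python
# Minimal sketch: averaging the predictive distributions of the
# top-k experts selected by a sparse gate.
import torch
import torch.nn as nn


class SparseMoEClassifier(nn.Module):
    def __init__(self, in_dim, out_dim, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(in_dim, n_experts)
        self.experts = nn.ModuleList(
            [nn.Linear(in_dim, out_dim) for _ in range(n_experts)]
        )

    def forward(self, x):
        gate_logits = self.gate(x)                        # (B, E)
        topv, topi = torch.topk(gate_logits, self.k, -1)  # (B, k)
        weights = torch.softmax(topv, dim=-1)             # (B, k)
        # Evaluate all experts for clarity; an efficient implementation
        # would only run the selected ones.
        all_probs = torch.stack(
            [torch.softmax(e(x), dim=-1) for e in self.experts], dim=1
        )                                                 # (B, E, O)
        idx = topi.unsqueeze(-1).expand(-1, -1, all_probs.shape[-1])
        top_probs = torch.gather(all_probs, 1, idx)       # (B, k, O)
        # The mixture of the experts' diverse explanations tends to be
        # more robust and better calibrated than any single expert.
        return (weights.unsqueeze(-1) * top_probs).sum(dim=1)
```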

Finally, on the topic of data efficiency in deep generative models, we develop a generative model that learns which symmetry transformations are present in a dataset. This symmetry-aware generative model can be used to imbue standard deep generative models with inductive biases about the underlying generative process of the data. We experimentally show that this improves data efficiency.
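As an illustration of the symmetry-learning idea, the sketch below draws a transformation from a learned distribution over planar rotations and applies it differentiably, so the extent of the symmetry present in the data can be learned by gradient descent. The choice of rotation group, the uniform angle distribution, and all names are assumptions for exposition; the thesis's model and training objective are not reproduced here.

```python
# Minimal sketch: a learnable distribution over a symmetry group
# (planar rotations), applied with a differentiable warp.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnedRotationSampler(nn.Module):
    """Learns the range of rotations present in the data: the range
    should grow towards [-pi, pi] if the data are rotation-invariant,
    and shrink towards zero if they are not."""

    def __init__(self):
        super().__init__()
        self.log_range = nn.Parameter(torch.tensor(0.0))

    def forward(self, n):
        max_angle = torch.exp(self.log_range).clamp(max=torch.pi)
        return (torch.rand(n) * 2 - 1) * max_angle  # radians


def rotate(images, angles):
    """Rotate a batch of images (n, c, h, w) with a differentiable
    affine grid, so gradients flow back into the angle distribution."""
    cos, sin = torch.cos(angles), torch.sin(angles)
    zeros = torch.zeros_like(angles)
    theta = torch.stack(
        [torch.stack([cos, -sin, zeros], -1),
         torch.stack([sin, cos, zeros], -1)], -2)   # (n, 2, 3)
    grid = F.affine_grid(theta, images.shape, align_corners=False)
    return F.grid_sample(images, grid, align_corners=False)
```

A standard generative model can then produce a prototype which is warped by a sampled transformation, giving the overall model an inductive bias matching whatever symmetry the sampler has learned.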

Date

2024-02-05

Advisors

Hernández-Lobato, José Miguel

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights and licensing

Except where otherwise noted, this item's license is described as All rights reserved

Sponsorship

Engineering and Physical Sciences Research Council (2275525)