Advances in Probabilistic Meta-Learning and the Neural Process Family


Type

Thesis

Authors

Gordon, Jonathan 

Abstract

A natural progression in machine learning research is to automate and learn from data increasingly many components of our learning agents. Meta-learning is a paradigm that fully embraces this perspective, and can be intuitively described as embodying the idea of learning to learn. A goal of meta-learning research is the development of models that assist users in navigating the intricate space of design choices associated with specifying machine learning solutions. This space is particularly formidable for deep learning approaches, which involve myriad design choices interacting in complex ways to affect the performance of the resulting agents. Despite the impressive successes of deep learning in recent years, this challenge remains a significant bottleneck in deploying neural-network-based solutions in several important application domains. But how can we reason about and design solutions to this daunting task?

This thesis is concerned with a particular perspective on meta-learning in supervised settings. We view supervised learning algorithms as mappings from data sets to predictive models, and consider meta-learning as learning to approximate functions of this form. In particular, we are interested in meta-learners that (i) employ neural networks to approximate these functions in an end-to-end manner, and (ii) provide predictive distributions rather than single predictors. The former is motivated by the success of neural networks as function approximators, and the latter by our interest in the few-shot learning scenario. The introductory chapters of this thesis formalise this notion and use it to provide a tutorial introduction to the Neural Process Family (NPF), a class of models introduced by Garnelo et al. (2018) that satisfies the above modelling desiderata. We then present our own technical contributions to the NPF.
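
To make the dataset-to-predictor view concrete, below is a minimal sketch of a Conditional Neural Process in the style of Garnelo et al. (2018). It is not the thesis's exact architecture; the layer widths and mean pooling are illustrative assumptions. An encoder embeds each context pair, a permutation-invariant mean aggregates the embeddings into a task representation, and a decoder maps that representation together with a target input to a predictive Gaussian.

    # Minimal sketch of a Conditional Neural Process (illustrative sizes).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CNP(nn.Module):
        def __init__(self, x_dim=1, y_dim=1, r_dim=128):
            super().__init__()
            # Encoder: embeds each (x, y) context pair independently.
            self.encoder = nn.Sequential(
                nn.Linear(x_dim + y_dim, r_dim), nn.ReLU(),
                nn.Linear(r_dim, r_dim),
            )
            # Decoder: maps (task representation, target x) to Gaussian parameters.
            self.decoder = nn.Sequential(
                nn.Linear(r_dim + x_dim, r_dim), nn.ReLU(),
                nn.Linear(r_dim, 2 * y_dim),
            )

        def forward(self, x_ctx, y_ctx, x_tgt):
            # Permutation-invariant aggregation of the context set (mean pooling).
            r = self.encoder(torch.cat([x_ctx, y_ctx], dim=-1)).mean(dim=0)
            r = r.expand(x_tgt.shape[0], -1)
            mu, sigma_raw = self.decoder(torch.cat([r, x_tgt], dim=-1)).chunk(2, dim=-1)
            # A predictive distribution over targets, not a point estimate.
            return torch.distributions.Normal(mu, F.softplus(sigma_raw))

    # Usage: condition on 10 observed pairs, predict at 50 target inputs.
    model = CNP()
    pred = model(torch.rand(10, 1), torch.randn(10, 1), torch.rand(50, 1))

The mean-pooled representation is what makes the map from data sets to predictors well defined: it is invariant to the ordering of the context points.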

First, we focus on fundamental properties of the model class, such as its expressivity and the limiting behaviours of the associated training procedures. Next, we study the role of translation equivariance in the NPF. Given the intimate relationship between the NPF and the representation of functions operating on sets, we extend the underlying theory of DeepSets to include translation equivariance. We then develop novel members of the NPF endowed with this important inductive bias. Through extensive empirical evaluation, we demonstrate that, in many settings, these models significantly outperform their non-equivariant counterparts.
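
The key construction can be sketched as follows. In the spirit of the ConvCNP (Gordon et al., 2020), an off-grid context set is smoothed onto a uniform grid with an RBF kernel, so that any convolution applied to the resulting signal is translation equivariant by construction. The grid resolution, lengthscale, and density normalisation below are illustrative assumptions rather than the thesis's exact choices.

    # Minimal sketch of a translation-equivariant set encoding ("SetConv").
    import torch

    def set_conv(x_ctx, y_ctx, x_grid, lengthscale=0.1):
        """Map a context set {(x_i, y_i)} to a two-channel signal on a grid."""
        # Pairwise squared distances between grid points and context inputs.
        d2 = (x_grid[:, None] - x_ctx[None, :]) ** 2
        w = torch.exp(-0.5 * d2 / lengthscale ** 2)   # (grid, ctx) weights
        density = w.sum(dim=1, keepdim=True)          # channel 0: data density
        signal = w @ y_ctx                            # channel 1: smoothed data
        # Normalising the data channel by the density stabilises sparse regions.
        return torch.cat([density, signal / (density + 1e-8)], dim=1)

    # Usage: 10 scattered 1D observations smoothed onto a 64-point grid.
    # A 1D CNN applied to the (64, 2) output is translation equivariant.
    rep = set_conv(torch.rand(10), torch.randn(10, 1), torch.linspace(0, 1, 64))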

Finally, we turn our attention to the development of Neural Processes for few-shot image classification. We introduce models that navigate the important trade-offs associated with this setting, and describe the specification of their central components. We demonstrate that the resulting models, CNAPs, achieve state-of-the-art performance on the challenging Meta-Dataset benchmark, while adapting faster and with less computational overhead than their best-performing competitors.
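
As a rough illustration of the adaptation mechanism involved, CNAPs (Requeima et al., 2019) modulate a fixed feature extractor with task-conditional feature-wise (FiLM-style) transformations whose parameters are produced by adaptation networks. The sketch below is a simplified, assumed version: the task embedding is a plain mean of context features, and single linear layers stand in for the richer adaptation networks used in practice.

    # Minimal sketch of FiLM-style task conditioning (illustrative dimensions).
    import torch
    import torch.nn as nn

    class FiLMAdapter(nn.Module):
        def __init__(self, task_dim=64, n_channels=32):
            super().__init__()
            # Generates per-channel scale and shift from a task embedding.
            self.to_gamma = nn.Linear(task_dim, n_channels)
            self.to_beta = nn.Linear(task_dim, n_channels)

        def forward(self, features, task_embedding):
            # features: (batch, channels); scale and shift depend on the task.
            return self.to_gamma(task_embedding) * features + self.to_beta(task_embedding)

    # Usage: summarise the task's context set, then modulate target features.
    ctx_feats = torch.randn(5, 64)        # hypothetical context features
    task_emb = ctx_feats.mean(dim=0)      # permutation-invariant task summary
    adapted = FiLMAdapter()(torch.randn(8, 32), task_emb)

Because adaptation amounts to a forward pass through small networks rather than gradient-based fine-tuning, models of this kind can adapt to new tasks quickly and cheaply.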

Date

2021-07-14

Advisors

Hernández-Lobato, José Miguel

Keywords

machine learning, meta-learning

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge