Advances in Meta-Learning, Robustness, and Second-Order Optimisation in Deep Learning


Abstract

In machine learning, we are concerned with developing algorithms that are able to learn, that is, to accumulate knowledge about how to perform a task without having been programmed specifically for that purpose. This thesis approaches learning from two perspectives: the domains to which we may apply efficient machine learners, and the ways in which we can improve learning by solving the underlying optimisation problem more efficiently.

Machine learning methods are typically very data hungry. Although modern machine learning has been hugely effective in solving real-world problems, these success stories are largely limited to settings where an enormous amount of domain-relevant data is available. The field of meta-learning aims to improve sample efficiency by creating models that “learn how to learn”, i.e. models that can adapt rapidly to new tasks when presented with a relatively small number of examples. In this thesis, we are concerned with amortised meta-learners, which perform task adaptation using hypernetworks to generate a task-adapted model. These learners are very cost-efficient, requiring just a single forward pass through the hypernetwork to learn how to perform a new task. We show that amortised meta-learners can be leveraged in novel ways that extend beyond their typical use in the few-shot learning setting.
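
To make this adaptation mechanism concrete, the following is a minimal sketch of an amortised meta-learner in PyTorch. It is an illustrative assumption rather than the architecture used in the thesis: the class name, layer sizes, and the simplification of conditioning only on the support inputs (a practical amortised learner would also encode the support labels) are all hypothetical.

```python
import torch
import torch.nn as nn

class AmortisedMetaLearner(nn.Module):
    """Illustrative hypernetwork-based learner: one forward pass over the
    support set yields the weights of a task-adapted linear classifier."""

    def __init__(self, input_dim=128, embed_dim=64, n_classes=5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, embed_dim), nn.ReLU())
        # Hypernetwork: maps a pooled support-set embedding to classifier parameters.
        self.hypernet = nn.Linear(embed_dim, embed_dim * n_classes + n_classes)
        self.embed_dim, self.n_classes = embed_dim, n_classes

    def adapt(self, support_x):
        """Generate task-adapted classifier weights from the support set."""
        z = self.encoder(support_x).mean(dim=0)   # permutation-invariant pooling
        params = self.hypernet(z)
        W = params[: self.embed_dim * self.n_classes].view(self.n_classes, self.embed_dim)
        b = params[self.embed_dim * self.n_classes:]
        return W, b

    def forward(self, support_x, query_x):
        W, b = self.adapt(support_x)              # "learning" is a single forward pass
        return self.encoder(query_x) @ W.t() + b  # query-set logits
```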

We develop a set-based poisoning attack against amortised meta-learners, which allows us to craft colluding sets of inputs that are tailored to fool the system’s learning algorithm when used as training data to adapt to new tasks (i.e. as a support set). Such jointly crafted adversarial inputs can collude to manipulate a classifier, and are especially easy to compute for amortised learners with differentiable adaptation mechanisms. We also employ amortised learners in the field of explainability to perform “dataset debugging”: we develop a data valuation, or sample importance, strategy called Meta-LOO that can be used to detect noisy or out-of-distribution data, or to distil a set of examples down to its most useful elements.
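
Because such adaptation mechanisms are differentiable, a support-set poisoning attack of this kind can be sketched as gradient ascent on the adapted model's query loss with respect to the support inputs. The function below is a hypothetical illustration in the style of projected gradient ascent, not the attack developed in the thesis, and it assumes a model exposing the differentiable forward(support_x, query_x) interface of the previous sketch.

```python
import torch
import torch.nn.functional as F

def poison_support_set(model, support_x, query_x, query_y,
                       steps=100, step_size=0.01, epsilon=0.1):
    """Illustrative set-based poisoning: jointly perturb the support inputs so
    that the classifier adapted from them performs poorly on the query set."""
    delta = torch.zeros_like(support_x, requires_grad=True)
    for _ in range(steps):
        logits = model(support_x + delta, query_x)   # adapt, then classify queries
        loss = F.cross_entropy(logits, query_y)      # the loss we want to increase
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += step_size * grad.sign()         # ascend the query loss
            delta.clamp_(-epsilon, epsilon)          # keep the colluding perturbations small
    return (support_x + delta).detach()
```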

From our second perspective, machine learning and optimisation are intimately linked; indeed, learning can be formulated as minimisation of the training loss with respect to the model’s parameters, though in practice we also require our algorithms to generalise, which is not a concern of optimisation more broadly. The chosen optimisation strategy affects both the speed at which algorithms learn and the quality of the solutions (i.e. model parameters) found. By studying optimisation, we may improve how well and how quickly our models are able to learn.
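
As a concrete illustration of this framing (standard empirical risk minimisation rather than a formulation specific to the thesis), training seeks parameters

$$
\theta^{\star} = \arg\min_{\theta} \frac{1}{N} \sum_{i=1}^{N} \ell\big(f_{\theta}(x_i),\, y_i\big),
$$

where $f_{\theta}$ is the model, $\ell$ the loss, and $\{(x_i, y_i)\}_{i=1}^{N}$ the training data; generalisation then asks how the resulting $\theta^{\star}$ behaves on data outside this sum.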

In this thesis we take a two-pronged approach towards this goal. First, we develop an online hypergradient-based hyperparameter optimisation strategy that improves on the state of the art by supporting a wide range of hyperparameters while remaining tractable at scale. Notably, our method supports hyperparameters of the optimisation algorithm itself, such as learning rates and momentum, which similar approaches in the literature do not. Second, we develop a second-order optimisation strategy that is applicable to the non-convex loss landscapes of deep learning. Our algorithm approximates a saddle-free version of the Hessian, for which saddle points are repulsive rather than attractive, in a way that scales to deep learning problems.
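
In generic form (these are the standard constructions from the literature; the thesis's contribution lies in approximating them tractably at deep-learning scale), the hypergradient differentiates a validation loss through the training procedure with respect to the hyperparameters $\lambda$, and the saddle-free second-order step replaces the Hessian's eigenvalues with their absolute values:

$$
\frac{\mathrm{d}\mathcal{L}_{\mathrm{val}}}{\mathrm{d}\lambda}
  = \frac{\partial \mathcal{L}_{\mathrm{val}}}{\partial \theta_{T}}
    \frac{\partial \theta_{T}}{\partial \lambda},
\qquad
H = Q \Lambda Q^{\top}, \quad
\Delta\theta = -\big(Q\,\lvert\Lambda\rvert\,Q^{\top}\big)^{-1} \nabla_{\theta} \mathcal{L},
$$

where $\theta_{T}$ denotes the parameters after $T$ optimiser steps (themselves functions of $\lambda$, e.g. of the learning rate), and taking $\lvert\Lambda\rvert$ makes negative-curvature directions at a saddle point push the update away from the saddle rather than towards it.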

Date

2023-03-31

Advisors

Turner, Richard

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights and licensing

Except where otherwise noted, this item's license is described as All Rights Reserved