SCALABLE ONE-PASS OPTIMISATION OF HIGH-DIMENSIONAL WEIGHT-UPDATE HYPERPARAMETERS BY IMPLICIT DIFFERENTIATION

Clarke, RM; Oldewage, ET; Hernández-Lobato, JM

cam.depositDate	2022-04-07
cam.orpheus.counter	13
cam.orpheus.success	Mon Aug 29 08:26:29 BST 2022 - Embargo updated
dc.contributor.author	Clarke, RM
dc.contributor.author	Oldewage, ET
dc.contributor.author	Hernández-Lobato, JM
dc.contributor.orcid	Clarke, Ross [0000-0001-9884-046X]
dc.date.accessioned	2022-04-07T23:30:25Z
dc.date.available	2022-04-07T23:30:25Z
dc.date.updated	2022-04-07T12:21:50Z
dc.description.abstract	Machine learning training methods depend plentifully and intricately on hyperparameters, motivating automated strategies for their optimisation. Many existing algorithms restart training for each new hyperparameter choice, at considerable computational cost. Some hypergradient-based one-pass methods exist, but these either cannot be applied to arbitrary optimiser hyperparameters (such as learning rates and momenta) or take several times longer to train than their base models. We extend these existing methods to develop an approximate hypergradient-based hyperparameter optimiser which is applicable to any continuous hyperparameter appearing in a differentiable model weight update, yet requires only one training episode, with no restarts. We also provide a motivating argument for convergence to the true hypergradient, and perform tractable gradient-based optimisation of independent learning rates for each model parameter. Our method performs competitively from varied random hyperparameter initialisations on several UCI datasets and Fashion-MNIST (using a one-layer MLP), Penn Treebank (using an LSTM) and CIFAR-10 (using a ResNet-18), in time only 2-3x greater than vanilla training.
dc.identifier.doi	10.17863/CAM.83331
dc.identifier.uri	https://www.repository.cam.ac.uk/handle/1810/335897
dc.language.iso	eng
dc.publisher.department	Department of Engineering Student
dc.publisher.url	https://iclr.cc/virtual/2022/spotlight/6510
dc.rights	Publisher's own licence
dc.subject	cs.LG
dc.subject	stat.ML
dc.title	SCALABLE ONE-PASS OPTIMISATION OF HIGH-DIMENSIONAL WEIGHT-UPDATE HYPERPARAMETERS BY IMPLICIT DIFFERENTIATION
dc.type	Conference Object
dcterms.dateAccepted	2022-01-28
prism.publicationName	ICLR 2022 - 10th International Conference on Learning Representations
pubs.conference-finish-date	2022-04-29
pubs.conference-name	International Conference on Learning Representations 2022
pubs.conference-start-date	2022-04-25
pubs.funder-project-id	Engineering and Physical Sciences Research Council (2107369)
pubs.licence-display-name	Apollo Repository Deposit Licence Agreement
pubs.licence-identifier	apollo-deposit-licence-2-1
rioxxterms.version	AM
rioxxterms.versionofrecord	10.17863/CAM.83331

Files

Original bundle

Name: 2110.10461v2.pdf
Size: 8.27 MB
Format: Adobe Portable Document Format
Description: Accepted version

Collections

Cambridge University Research Outputs