Show simple item record

dc.contributor.authorDriggs, Derek
dc.contributor.authorEhrhardt, MJ
dc.contributor.authorSchönlieb, CB
dc.date.accessioned2022-02-23T17:01:15Z
dc.date.available2022-02-23T17:01:15Z
dc.date.issued2022-02
dc.date.submitted2019-10-08
dc.identifier.issn0025-5610
dc.identifier.others10107-020-01566-2
dc.identifier.other1566
dc.identifier.urihttps://www.repository.cam.ac.uk/handle/1810/334374
dc.descriptionFunder: Gates Cambridge Trust (GB)
dc.description.abstract<jats:title>Abstract</jats:title><jats:p>Variance reduction is a crucial tool for improving the slow convergence of stochastic gradient descent. Only a few variance-reduced methods, however, have yet been shown to directly benefit from Nesterov’s acceleration techniques to match the convergence rates of accelerated gradient methods. Such approaches rely on “negative momentum”, a technique for further variance reduction that is generally specific to the SVRG gradient estimator. In this work, we show for the first time that negative momentum is unnecessary for acceleration and develop a universal acceleration framework that allows all popular variance-reduced methods to achieve accelerated convergence rates. The constants appearing in these rates, including their dependence on the number of functions <jats:italic>n</jats:italic>, scale with the mean-squared-error and bias of the gradient estimator. In a series of numerical experiments, we demonstrate that versions of SAGA, SVRG, SARAH, and SARGE using our framework significantly outperform non-accelerated versions and compare favourably with algorithms using negative momentum.</jats:p>
dc.languageen
dc.publisherSpringer Science and Business Media LLC
dc.subjectFull Length Paper
dc.subjectStochastic optimisation
dc.subjectConvex optimisation
dc.subjectVariance reduction
dc.subjectAccelerated gradient descent
dc.subject90C06
dc.subject90C15
dc.subject90C25
dc.subject90C30
dc.subject90C60
dc.subject68Q25
dc.titleAccelerating variance-reduced stochastic gradient methods
dc.typeArticle
dc.date.updated2022-02-23T17:01:15Z
prism.endingPage715
prism.issueIdentifier2
prism.publicationNameMathematical Programming
prism.startingPage671
prism.volume191
dc.identifier.doi10.17863/CAM.81790
dcterms.dateAccepted2020-09-07
rioxxterms.versionofrecord10.1007/s10107-020-01566-2
rioxxterms.versionVoR
rioxxterms.licenseref.urihttp://creativecommons.org/licenses/by/4.0/
dc.contributor.orcidDriggs, Derek [0000-0003-1582-5884]
dc.identifier.eissn1436-4646
pubs.funder-project-idEngineering and Physical Sciences Research Council (EP/M00483X/1)
pubs.funder-project-idEngineering and Physical Sciences Research Council (EP/N014588/1)
pubs.funder-project-idEuropean Commission Horizon 2020 (H2020) Marie Sk?odowska-Curie actions (691070)
pubs.funder-project-idEuropean Commission Horizon 2020 (H2020) Marie Sk?odowska-Curie actions (777826)
pubs.funder-project-idEPSRC (EP/S026045/1)
pubs.funder-project-idEngineering and Physical Sciences Research Council (EP/H023348/1)
pubs.funder-project-idLeverhulme Trust (PLP-2017-275)
pubs.funder-project-idAlan Turing Institute (Unknown)
cam.issuedOnline2020-09-15


Files in this item

Thumbnail
Thumbnail

This item appears in the following Collection(s)

Show simple item record