
On the capacity and superposition of minima in neural network loss function landscapes

cam.issuedOnline: 2022-04-20
dc.contributor.author: Niroomand, MP
dc.contributor.author: Morgan, JWR
dc.contributor.author: Cafolla, CT
dc.contributor.author: Wales, DJ
dc.contributor.orcid: Niroomand, MP [0000-0002-7189-0456]
dc.contributor.orcid: Cafolla, CT [0000-0003-2021-974X]
dc.contributor.orcid: Wales, DJ [0000-0002-3555-6645]
dc.date.accessioned: 2022-04-20T14:00:10Z
dc.date.available: 2022-04-20T14:00:10Z
dc.date.issued: 2022
dc.date.submitted: 2021-11-10
dc.date.updated: 2022-04-20T14:00:10Z
dc.description.abstract: Minima of the loss function landscape (LFL) of a neural network are locally optimal sets of weights that extract and process information from the input data to make outcome predictions. In underparameterised networks, the capacity of the weights may be insufficient to fit all the relevant information. We demonstrate that different local minima specialise in certain aspects of the learning problem, and process the input information differently. This effect can be exploited using a meta-network in which the predictive power from multiple minima of the LFL is combined to produce a better classifier. With this approach, we can increase the area under the receiver operating characteristic curve by around 20% for a complex learning problem. We propose a theoretical basis for combining minima and show how a meta-network can be trained to select the representative that is used for classification of a specific data item. Finally, we present an analysis of symmetry-equivalent solutions to machine learning problems, which provides a systematic means to improve the efficiency of this approach.
dc.identifier.doi: 10.17863/CAM.83680
dc.identifier.eissn: 2632-2153
dc.identifier.issn: 2632-2153
dc.identifier.other: mlstac64e6
dc.identifier.other: ac64e6
dc.identifier.other: mlst-100458.r2
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/336261
dc.language: en
dc.language.iso: eng
dc.publisher: IOP Publishing
dc.publisher.url: http://dx.doi.org/10.1088/2632-2153/ac64e6
dc.subject: ensemble learning
dc.subject: interpretability
dc.subject: loss function landscape
dc.subject: theoretical chemistry
dc.title: On the capacity and superposition of minima in neural network loss function landscapes
dc.type: Article
dcterms.dateAccepted: 2022-04-06
prism.issueIdentifier: 2
prism.publicationName: Machine Learning: Science and Technology
prism.volume: 3
pubs.funder-project-id: Agence Nationale de la Recherche (ANR-19-P3IA-0002)
rioxxterms.licenseref.uri: http://creativecommons.org/licenses/by/4.0
rioxxterms.version: VoR
rioxxterms.versionofrecord: 10.1088/2632-2153/ac64e6

Files

Original bundle
Name: metadata.xml
Size: 8.33 KB
Format: Extensible Markup Language
Description: Bibliographic metadata
Licence: http://creativecommons.org/licenses/by/4.0

Name: pdf.pdf
Size: 1.11 MB
Format: Adobe Portable Document Format
Description: Published version
Licence: http://creativecommons.org/licenses/by/4.0