On the capacity and superposition of minima in neural network loss function landscapes
Publication Date
2022
Journal Title
Machine Learning: Science and Technology
ISSN
2632-2153
Publisher
IOP Publishing
Volume
3
Issue
2
Language
en
Type
Article
This Version
VoR (Version of Record)
Citation
Niroomand, M., Morgan, J., Cafolla, C., & Wales, D. (2022). On the capacity and superposition of minima in neural network loss function landscapes. Machine Learning: Science and Technology, 3(2). https://doi.org/10.1088/2632-2153/ac64e6
Abstract
Minima of the loss function landscape (LFL) of a neural network are locally optimal sets of weights that extract and process information from the input data to make outcome predictions. In underparameterised networks, the capacity of the weights may be insufficient to fit all the relevant information. We demonstrate that different local minima specialise in certain aspects of the learning problem, and process the input information differently. This effect can be exploited using a meta-network in which the predictive power from multiple minima of the LFL is combined to produce a better classifier. With this approach, we can increase the area under the receiver operating characteristic curve by around 20% for a complex learning problem. We propose a theoretical basis for combining minima and show how a meta-network can be trained to select the representative that is used for classification of a specific data item. Finally, we present an analysis of symmetry-equivalent solutions to machine learning problems, which provides a systematic means to improve the efficiency of this approach.
Keywords
Paper, Focus on Physics-Informed Machine Learning: Theory and Methods, ensemble learning, interpretability, loss function landscape, theoretical chemistry
Sponsorship
Agence Nationale de la Recherche (ANR-19-P3IA-0002)
Identifiers
mlstac64e6, ac64e6, mlst-100458.r2
External DOI: https://doi.org/10.1088/2632-2153/ac64e6
This record's URL: https://www.repository.cam.ac.uk/handle/1810/336261
Rights
Licence:
http://creativecommons.org/licenses/by/4.0