Exact natural gradient in deep linear networks and application to the nonlinear case
Publication Date
2018
Journal Title
32nd Conference on Neural Information Processing Systems (NIPS 2018), Montréal, Canada
Conference Name
Neural Information Processing Systems
ISSN
1049-5258
Publisher
NIPS
Type
Conference Object
This Version
AM
Citation
Bernacchia, A., Lengyel, M., & Hennequin, G. (2018). Exact natural gradient in deep linear networks and application to the nonlinear case. 32nd Conference on Neural Information Processing Systems (NIPS 2018), Montréal, Canada. https://doi.org/10.17863/CAM.35433
Abstract
Stochastic gradient descent (SGD) remains the method of choice for deep learning, despite the limitations arising for ill-behaved objective functions. In cases where it could be estimated, the natural gradient has proven very effective at mitigating the catastrophic effects of pathological curvature in the objective function, but little is known theoretically about its convergence properties, and it has yet to find a practical implementation that would scale to very deep and large networks. Here, we derive an exact expression for the natural gradient in deep linear networks, which exhibit pathological curvature similar to the nonlinear case. We provide for the first time an analytical solution for its convergence rate, showing that the loss decreases exponentially to the global minimum in parameter space. Our expression for the natural gradient is surprisingly simple, computationally tractable, and explains why some approximations proposed previously work well in practice. This opens new avenues for approximating the natural gradient in the nonlinear case, and we show in preliminary experiments that our online natural gradient descent outperforms SGD on MNIST autoencoding while sharing its computational simplicity.
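As a rough illustration of why natural gradient helps under pathological curvature, the sketch below compares plain gradient descent with natural gradient descent on a toy problem. It assumes a single-layer linear regression with Gaussian outputs, where the Fisher matrix equals the input covariance; this is not the deep-network expression derived in the paper, and all names (`descend`, `cov`, `w_star`, the learning rates) are hypothetical choices made for the example.

```python
# Toy comparison of gradient descent and natural gradient descent.
# Assumption: a shallow linear model with loss
#   L(w) = 0.5 * (w - w_star)^T C (w - w_star),
# where C is the input covariance, which is also the Fisher matrix here.
# This is an illustrative sketch, not the paper's deep-network result.

def matvec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def inverse_2x2(m):
    """Invert a 2x2 matrix using the closed-form adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def loss(w, w_star, cov):
    """Quadratic loss around the optimum w_star."""
    d = [w[0] - w_star[0], w[1] - w_star[1]]
    cd = matvec(cov, d)
    return 0.5 * (d[0] * cd[0] + d[1] * cd[1])

def gradient(w, w_star, cov):
    """Ordinary gradient of the loss: C (w - w_star)."""
    d = [w[0] - w_star[0], w[1] - w_star[1]]
    return matvec(cov, d)

def descend(w0, w_star, cov, lr, steps, natural):
    """Run `steps` updates and return the loss after each one."""
    fisher_inv = inverse_2x2(cov)
    w = list(w0)
    losses = [loss(w, w_star, cov)]
    for _ in range(steps):
        g = gradient(w, w_star, cov)
        if natural:
            # Precondition by the inverse Fisher matrix.
            g = matvec(fisher_inv, g)
        w = [w[0] - lr * g[0], w[1] - lr * g[1]]
        losses.append(loss(w, w_star, cov))
    return losses

# Ill-conditioned curvature: one steep direction, one very flat direction.
cov = [[100.0, 0.0], [0.0, 0.01]]
w_star = [1.0, -2.0]
w0 = [0.0, 0.0]

# Plain gradient descent needs lr < 2/100 to remain stable.
gd_losses = descend(w0, w_star, cov, lr=0.01, steps=50, natural=False)
# Natural gradient descent contracts every direction by the same factor.
ngd_losses = descend(w0, w_star, cov, lr=0.5, steps=50, natural=True)

print("GD  final loss:", gd_losses[-1])
print("NGD final loss:", ngd_losses[-1])
```

In this toy setting the natural gradient step shrinks the error by the same factor (1 - lr) in every direction, so the loss falls geometrically regardless of conditioning, whereas plain gradient descent stalls along the flat direction.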
Sponsorship
This work was supported by Wellcome Trust Seed Award 202111/Z/16/Z (G.H.) and Wellcome Trust Investigator Award 095621/Z/11/Z (A.B., M.L.).
Funder references
Wellcome Trust (202111/Z/16/Z)
Wellcome Trust (095621/Z/11/Z)
Identifiers
External DOI: https://doi.org/10.17863/CAM.35433
This record's URL: https://www.repository.cam.ac.uk/handle/1810/288118
Rights
Licence:
http://www.rioxx.net/licenses/all-rights-reserved