The difficulty of computing stable and accurate neural networks: On the barriers of deep learning and Smale's 18th problem
Authors
Colbrook, Matthew J
Antun, Vegard
Hansen, Anders C
Publication Date
2022
Journal Title
Proceedings of the National Academy of Sciences of the United States of America
ISSN
0027-8424
Publisher
National Academy of Sciences
Volume
119
Issue
12
Language
eng
Type
Article
This Version
VoR (Version of Record)
Citation
Colbrook, M. J., Antun, V., & Hansen, A. C. (2022). The difficulty of computing stable and accurate neural networks: On the barriers of deep learning and Smale's 18th problem. Proceedings of the National Academy of Sciences of the United States of America, 119(12). https://doi.org/10.1073/pnas.2107151119
Abstract
Deep learning (DL) has had unprecedented success and is now entering
scientific computing with full force. However, current DL methods typically
suffer from instability, even when universal approximation properties guarantee
the existence of stable neural networks (NNs). We address this paradox by
demonstrating basic well-conditioned problems in scientific computing where one
can prove the existence of NNs with great approximation qualities; however, no
algorithm, even a randomised one, can train (or compute) such a NN. For any
positive integers $K > 2$ and $L$, there are cases
where simultaneously: (a) no randomised training algorithm can compute a NN
correct to $K$ digits with probability greater than $1/2$, (b) there exists a
deterministic training algorithm that computes a NN with $K-1$ correct digits,
but any such (even randomised) algorithm needs arbitrarily many training data,
(c) there exists a deterministic training algorithm that computes a NN with
$K-2$ correct digits using no more than $L$ training samples. These results
imply a classification theory describing conditions under which (stable) NNs
with a given accuracy can be computed by an algorithm. We begin this theory by
establishing sufficient conditions for the existence of algorithms that compute
stable NNs in inverse problems. We introduce Fast Iterative REstarted NETworks
(FIRENETs), which we both prove and numerically verify are stable. Moreover, we
prove that only $\mathcal{O}(|\log(\epsilon)|)$ layers are needed for an
$\epsilon$-accurate solution to the inverse problem.
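As a reading aid for the final claim, the layer count follows the standard argument for unrolled, linearly convergent iterations. The sketch below assumes a fixed per-layer error contraction factor $\rho \in (0,1)$; the symbols $\rho$, $x_n$, and $x^\ast$ are illustrative and not quantities defined in the abstract. If each layer of the unrolled scheme contracts the reconstruction error by $\rho$, then
\[
\|x_n - x^\ast\| \le \rho^{n}\,\|x_0 - x^\ast\| \le \epsilon
\quad\text{whenever}\quad
n \ \ge\ \frac{\log\!\big(\|x_0 - x^\ast\|/\epsilon\big)}{\log(1/\rho)}
\;=\; \mathcal{O}\!\left(|\log(\epsilon)|\right),
\]
so, for fixed $\rho$ and initial error, $\epsilon$-accuracy requires only logarithmically many layers.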
Keywords
stability and accuracy, AI and deep learning, inverse problems, Smale's 18th problem, solvability complexity index hierarchy
Sponsorship
Leverhulme Prize (n/a)
Royal Society (n/a)
Trinity College, University of Cambridge (n/a)
Identifiers
PMID: 35294283; PMCID: PMC8944871
External DOI: https://doi.org/10.1073/pnas.2107151119
This record's URL: https://www.repository.cam.ac.uk/handle/1810/336157
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International
Licence URL: https://creativecommons.org/licenses/by-nc-nd/4.0/