Deep learning as optimal control problems: Models and numerical methods

Accepted version
Peer-reviewed

Type

Article

Authors

Benning, M 
Celledoni, E 
Ehrhardt, MJ 
Owren, B 
Schönlieb, CB 

Abstract

We consider recent work of Haber and Ruthotto 2017 and Chang et al. 2018, where deep learning neural networks have been interpreted as discretisations of an optimal control problem subject to an ordinary differential equation constraint. We review the first order conditions for optimality, and the conditions ensuring optimality after discretisation. This leads to a class of algorithms for solving the discrete optimal control problem which guarantee that the corresponding discrete necessary conditions for optimality are fulfilled. The differential equation setting lends itself to learning additional parameters such as the time discretisation. We explore this extension alongside natural constraints (e.g. time steps lie in a simplex). We compare these deep learning algorithms numerically in terms of induced flow and generalisation ability.

Keywords

math.OC, cs.LG, cs.NA, math.NA

Journal Title

Journal of Computational Dynamics

Journal ISSN

2158-2505

Volume Title

6

Publisher

American Institute of Mathematical Sciences (AIMS)

Rights

All rights reserved

Sponsorship

Engineering and Physical Sciences Research Council (EP/H023348/1)
Engineering and Physical Sciences Research Council (EP/M00483X/1)
Engineering and Physical Sciences Research Council (EP/N014588/1)
European Commission Horizon 2020 (H2020) Marie Skłodowska-Curie actions (691070)
Alan Turing Institute (unknown)
European Commission Horizon 2020 (H2020) Marie Skłodowska-Curie actions (777826)
Leverhulme Trust (PLP-2017-275)