Deep convolutional networks as shallow Gaussian processes

Garriga-Alonso, A; Aitchison, L; Rasmussen, CE

Repository URI

https://www.repository.cam.ac.uk/handle/1810/295286

Repository DOI

https://doi.org/10.17863/CAM.42340

Files

Submitted version (406.87 KB)

Type

Article

Authors

Garriga-Alonso, A

Aitchison, L

Rasmussen, CE

Abstract

We show that the output of a (residual) convolutional neural network (CNN) with an appropriate prior over the weights and biases is a Gaussian process (GP) in the limit of infinitely many convolutional filters, extending similar results for dense networks. For a CNN, the equivalent kernel can be computed exactly and, unlike "deep kernels", has very few parameters: only the hyperparameters of the original CNN. Further, we show that this kernel has two properties that allow it to be computed efficiently; the cost of evaluating the kernel for a pair of images is similar to a single forward pass through the original CNN with only one filter per layer. The kernel equivalent to a 32-layer ResNet obtains 0.84% classification error on MNIST, a new record for GPs with a comparable number of parameters.
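
The abstract's cost claim, that evaluating the kernel for a pair of images is similar to one forward pass with a single filter per layer, can be illustrated with a minimal NumPy sketch for a plain (non-residual) ReLU CNN. This is not the authors' implementation. The function names, the zero padding, the stride of 1, the normalisation of the weight variance by patch size and channel count, and the default hyperparameters `var_w` and `var_b` are all assumptions made for this example. The sketch propagates only three single-channel maps through the layers: one covariance and two variances per spatial position.

```python
import numpy as np


def patch_sum(m, k):
    """Sum a 2-D map over k x k patches (odd k, stride 1, zero padding)."""
    pad = k // 2
    p = np.pad(m, pad)
    out = np.zeros_like(m, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + m.shape[0], dx:dx + m.shape[1]]
    return out


def relu_expectation(kxy, kxx, kyy):
    """E[relu(u) relu(v)] for zero-mean Gaussian (u, v) with the given
    covariance kxy and variances kxx, kyy (degree-1 arc-cosine kernel)."""
    norm = np.sqrt(kxx * kyy)
    theta = np.arccos(np.clip(kxy / norm, -1.0, 1.0))
    return norm * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)


def cnn_gp_kernel(x, y, depth=3, k=3, var_w=2.0, var_b=0.1):
    """Kernel value for two images of shape (channels, height, width).

    Assumed architecture (hypothetical, for illustration only): `depth`
    convolutional layers with k x k filters and ReLU, followed by a dense
    output layer over all spatial positions.
    """
    def conv_step(m):
        # Effect of one convolutional layer on a covariance map.
        return var_b + var_w * patch_sum(m, k) / k**2

    # First layer: inner products of input patches, averaged over channels.
    kxy = conv_step((x * y).mean(axis=0))
    kxx = conv_step((x * x).mean(axis=0))
    kyy = conv_step((y * y).mean(axis=0))

    for _ in range(depth - 1):
        # Pass the three maps through the ReLU, then through the next layer.
        vxy = relu_expectation(kxy, kxx, kyy)
        vxx = relu_expectation(kxx, kxx, kxx)
        vyy = relu_expectation(kyy, kyy, kyy)
        kxy, kxx, kyy = conv_step(vxy), conv_step(vxx), conv_step(vyy)

    # Dense output layer: average the post-ReLU covariance over positions.
    return var_b + var_w * relu_expectation(kxy, kxx, kyy).mean()


# Usage: kernel value between two random single-channel 8 x 8 "images".
rng = np.random.default_rng(0)
img_a = rng.normal(size=(1, 8, 8))
img_b = rng.normal(size=(1, 8, 8))
print(cnn_gp_kernel(img_a, img_b))
```

Each layer touches three height-by-width maps regardless of how many filters the corresponding finite CNN would have, which is the sense in which the cost resembles a forward pass with one filter per layer.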

Keywords

stat.ML, cs.LG

Journal Title

7th International Conference on Learning Representations, ICLR 2019

Collections

Cambridge University Research Outputs