Geometry and Uncertainty in Deep Learning for Computer Vision

Kendall, Alex Guy

doi:10.17863/CAM.35260

Repository URI

https://www.repository.cam.ac.uk/handle/1810/287944

Repository DOI

https://doi.org/10.17863/CAM.35260

Files

Thesis (77.57 MB)

Type

Thesis

Authors

Kendall, Alex Guy

https://orcid.org/0000-0003-1904-5885

Abstract

Deep learning and convolutional neural networks have become the dominant tool for computer vision. These techniques excel at learning complicated representations from data using supervised learning. In particular, image recognition models now outperform human baselines under constrained settings. However, the science of computer vision aims to build machines which can see. This requires models which can extract richer information from images and video than recognition alone. In general, applying these deep learning models from recognition to other problems in computer vision is significantly more challenging.

This thesis presents end-to-end deep learning architectures for a number of core computer vision problems: scene understanding, camera pose estimation, stereo vision and video semantic segmentation. Our models outperform traditional approaches and advance the state of the art on a number of challenging computer vision benchmarks. However, these end-to-end models are often not interpretable and require enormous quantities of training data.

To address this, we make two observations: (i) we do not need to learn everything from scratch, because we know a lot about the physical world, and (ii) we cannot know everything from data, so our models should be aware of what they do not know. This thesis explores these ideas using concepts from geometry and uncertainty. First, we show how to improve end-to-end deep learning models by leveraging the underlying geometry of the problem. We explicitly model concepts such as epipolar geometry to learn with unsupervised learning, which improves performance. Second, we introduce ideas from probabilistic modelling and Bayesian deep learning to understand uncertainty in computer vision models. We show how to quantify different types of uncertainty, improving safety for real-world applications.

Date

2017-11-30

Advisors

Cipolla, Roberto

Keywords

Deep Learning, Computer Vision, Machine Learning, Robotics

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Sponsorship

My PhD was funded by the Woolf Fisher Trust

Collections

Theses - Engineering
