Minimal Labels, Maximum Gain. Image Classification with Graph-Based Semi-Supervised Learning

Sellars, Philip

doi:10.17863/CAM.83262

Repository URI

https://www.repository.cam.ac.uk/handle/1810/335828

Repository DOI

https://doi.org/10.17863/CAM.83262

Files

Thesis (42.35 MB)

Type

Thesis

Authors

Sellars, Philip

https://orcid.org/0000-0002-9800-7010

Abstract

In the last decade, the use and deployment of machine learning systems for computer vision has risen dramatically. Training a machine learning model usually assumes that the practitioner has access to a large and representative labelled dataset on which the model can be optimised in a supervised manner. However, in many domains, obtaining labelled data is costly. In technical fields, annotations must be produced manually by domain experts, and deep learning models need large datasets to reduce over-fitting.

Acting as a potential solution, the paradigm of semi-supervised learning extracts information from both labelled and unlabelled data and reduces the number of labels needed for training. This thesis deals with the development of novel classical and deep machine learning approaches for semi-supervised image classification. Our approaches are centred around graph-based learning, and we apply them to a range of real-world problems including hyperspectral, natural and medical imaging.

Firstly, we propose and design a superpixel contracted semi-supervised learning framework to classify hyperspectral images. This approach is built around the p=2 graph Laplacian and uses over-segmentation both to greatly reduce the size of the graph and to provide a regularizing prior. Secondly, we combine graph-based semi-supervised learning with deep neural networks and re-examine modern data ablation to create a state-of-the-art framework for natural image classification. Finally, we combine graph-based approaches, optimising the more demanding p=1 graph Laplacian, with deep neural network architectures and apply them to the field of medical imaging. We design a general framework for diagnosis and apply it to chest X-rays, including the diagnosis of COVID-19. For all the approaches in the thesis, we show, through rigorous experiments and detailed ablation studies, that our models produce state-of-the-art results and are competitive with fully supervised models whilst using only a fraction of the available labels.
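
As a rough illustration of what learning with the p=2 graph Laplacian involves, the sketch below implements the classical harmonic-function form of label propagation on a small weighted graph. It is not the thesis's method: the function name `harmonic_label_propagation`, the toy six-node chain, and its edge weights are all assumptions made for this example, and superpixels and neural networks are left out entirely.

```python
import numpy as np

def harmonic_label_propagation(W, labels, labelled_idx):
    """Propagate labels on a graph by minimising the p=2 Laplacian energy.

    W            : (n, n) symmetric, non-negative weight matrix (assumed connected)
    labels       : class label for each labelled node
    labelled_idx : indices of the labelled nodes, in the same order as `labels`
    Returns an array with a predicted class for every node.
    """
    n = W.shape[0]
    labels = np.asarray(labels)
    labelled_idx = np.asarray(labelled_idx)
    unlabelled_idx = np.setdiff1d(np.arange(n), labelled_idx)

    # One-hot encode the known labels.
    classes = np.unique(labels)
    Y_l = (labels[:, None] == classes[None, :]).astype(float)

    # Unnormalised graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    L_uu = L[np.ix_(unlabelled_idx, unlabelled_idx)]
    L_ul = L[np.ix_(unlabelled_idx, labelled_idx)]

    # Minimiser of the quadratic energy with labelled values held fixed:
    # L_uu F_u = -L_ul Y_l
    F_u = np.linalg.solve(L_uu, -L_ul @ Y_l)

    pred = np.empty(n, dtype=classes.dtype)
    pred[labelled_idx] = labels
    pred[unlabelled_idx] = classes[F_u.argmax(axis=1)]
    return pred

# Toy example (hypothetical data): a chain 0-1-2-3-4-5 with a weak link
# between nodes 2 and 3, and one labelled node at each end.
W = np.zeros((6, 6))
for i, w in enumerate([1.0, 1.0, 0.1, 1.0, 1.0]):
    W[i, i + 1] = W[i + 1, i] = w

pred = harmonic_label_propagation(W, labels=[0, 1], labelled_idx=[0, 5])
print(pred)
```

With only two labels, the weak edge between nodes 2 and 3 acts as the decision boundary, so each half of the chain takes the label of its own end point. Shrinking the graph, for example by merging pixels into superpixels, shrinks the linear system that has to be solved.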

Overall, the contributions of this thesis are focused on the design and implementation of new graph-based semi-supervised frameworks for image classification, which include geometrical and data constraints along with deep neural networks. Together, they highlight the power of semi-supervised learning to overcome the need for costly labelled datasets.

Date

2021-10-01

Advisors

Schönlieb, Carola-Bibiane
Aviles-Rivero, Angelica

Keywords

Deep-Learning, Image Classification, Graphical Models, Semi-Supervised

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

Engineering and Physical Sciences Research Council (EPSRC) (1945976)

National Physical Laboratory; EPSRC

Collections

Theses - Applied Mathematics and Theoretical Physics
