Semantic localisation via globally unique instance segmentation

Budvytis, I; Sauer, P; Cipolla, R

Published version

Peer-reviewed

Repository URI

https://www.repository.cam.ac.uk/handle/1810/296049

Repository DOI

https://doi.org/10.17863/CAM.43095

Files

Published version (10.83 MB)

Type

Conference Object

Authors

Budvytis, I

Sauer, P

Cipolla, Roberto

https://orcid.org/0000-0002-8999-2151

Abstract

In this work, we propose a novel approach to semantic localisation. Our work is motivated by the need for environment perception techniques that not only perform self-localisation within a map but also simultaneously recognise surrounding objects. Such capabilities are crucial for computer vision applications that interact with the environment, such as autonomous driving, augmented reality and robotics. To achieve this goal, we propose a solution consisting of three key steps. Firstly, a database of panoramic RGB images and corresponding globally unique, per-pixel object instance labels is built for the desired environment; we typically consider objects from static categories such as "building" or "tree". Secondly, a semantic segmentation network capable of predicting more than 3000 labels is trained on the collected data. Finally, for a given panoramic query image, the corresponding instance label image predicted by the network is used for semantic matching within the database. The matching is performed in two stages: (i) a fast retrieval of a small subset of database images (~100) with highly overlapping instance label histograms, followed by (ii) an explicit approximate 3-DoF (yaw, pitch, roll) alignment of the selected subset of images and the query image. We evaluate our approach in challenging indoor and outdoor navigation scenarios, achieving better or similar performance compared to state-of-the-art image retrieval-based localisation approaches that use key-point matching and image-level embeddings. Our contributions are: (i) a description of a novel semantic localisation approach using globally unique instance segmentation, (ii) corresponding quantitative and qualitative analysis and (iii) a novel CamVid-360 dataset containing 986 labelled instances of buildings, trees, road signs and poles.
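The two-stage matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of normalised histogram intersection as the overlap score, and the reduction of the 3 DoF alignment to a yaw-only search over horizontal circular shifts of a panoramic label image are all assumptions made for illustration. Label images are represented as plain nested lists of integer instance ids.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical representation: rows of per-pixel, globally unique instance ids.
LabelImage = Sequence[Sequence[int]]


def label_histogram(label_image: LabelImage) -> Counter:
    """Count how many pixels carry each instance id."""
    hist: Counter = Counter()
    for row in label_image:
        hist.update(row)
    return hist


def histogram_overlap(a: Counter, b: Counter) -> float:
    """Histogram intersection, normalised by the pixel count of `a` (assumed score)."""
    total = sum(a.values())
    if total == 0:
        return 0.0
    shared = sum(min(count, b[label]) for label, count in a.items())
    return shared / total


def shortlist(query: LabelImage, database: List[LabelImage], k: int = 100) -> List[int]:
    """Stage (i): indices of the k database images with the highest overlap."""
    q_hist = label_histogram(query)
    scores = [histogram_overlap(q_hist, label_histogram(img)) for img in database]
    order = sorted(range(len(database)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def best_yaw_agreement(query: LabelImage, candidate: LabelImage) -> Tuple[int, float]:
    """Stage (ii), simplified to yaw only: try every horizontal circular shift
    and keep the one with the largest fraction of agreeing pixel labels."""
    width = len(query[0])
    n_pixels = len(query) * width
    best_shift, best_score = 0, -1.0
    for shift in range(width):
        agree = sum(
            1
            for q_row, c_row in zip(query, candidate)
            for x in range(width)
            if q_row[x] == c_row[(x + shift) % width]
        )
        score = agree / n_pixels
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score


def localise(query: LabelImage, database: List[LabelImage], k: int = 100) -> Tuple[int, int, float]:
    """Return (database index, yaw shift in pixels, agreement) of the best match."""
    if not database:
        raise ValueError("database must contain at least one label image")
    results = []
    for idx in shortlist(query, database, k):
        shift, score = best_yaw_agreement(query, database[idx])
        results.append((score, idx, shift))
    score, idx, shift = max(results)
    return idx, shift, score
```

In this sketch the cheap histogram comparison prunes the database before the more expensive per-pixel alignment is attempted, which mirrors the coarse-to-fine structure the abstract describes. A full 3 DoF alignment would also need to account for pitch and roll, which a horizontal shift alone cannot model.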

Journal Title

British Machine Vision Conference 2018, BMVC 2018

Conference Name

British Machine Vision Conference, BMVC 2018

Publisher DOI

https://doi.org/10.17863/CAM.43095

Collections

Cambridge University Research Outputs