Large Scale Labelled Video Data Augmentation for Semantic Segmentation in Driving Scenarios

In this paper we present an analysis of the effect of large-scale video data augmentation for semantic segmentation in driving scenarios. Our work is motivated by a strong correlation between the high performance of most recent deep-learning-based methods and the availability of large volumes of ground truth labels. To generate additional labelled data, we make use of an occlusion-aware and uncertainty-enabled label propagation algorithm. As a result, we increase the availability of high-resolution labelled frames by a factor of 20, yielding a 6.8% to 10.8% rise in average classification accuracy and/or IoU scores for several semantic segmentation networks. Our key contributions include: (a) augmented CityScapes and CamVid datasets providing 56.2K and 6.5K additional labelled frames of object classes respectively, (b) a detailed empirical analysis of the effect of using the augmented data, and (c) an extension of the proposed framework to instance segmentation.

Keywords

46 Information and Computing Sciences, 4603 Computer Vision and Multimedia Computation, Machine Learning and Artificial Intelligence, Networking and Information Technology R&D (NITRD)

Journal Title

Proceedings - 2017 IEEE International Conference on Computer Vision Workshops, ICCVW 2017

Conference Name

2017 IEEE International Conference on Computer Vision Workshop (ICCVW)

Journal ISSN

2473-9936

Volume Title

2018-January

Publisher

IEEE

Publisher DOI

https://doi.org/10.1109/ICCVW.2017.36

Rights and licensing

Except where otherwise noted, this item's license is described as http://www.rioxx.net/licenses/all-rights-reserved

Collections

Scholarly Works - Engineering