
dc.contributor.author: Bendale, Pashmina Zipar
dc.date.accessioned: 2015-10-15T09:50:48Z
dc.date.available: 2015-10-15T09:50:48Z
dc.date.issued: 2011-02-08
dc.identifier.other: PhD.33957
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/252226
dc.description.abstract: This thesis develops a multiscale keypoint detector and descriptor based on the Dual-Tree Complex Wavelet Transform (DTCWT). First, we develop a scale-space framework called the 4S-DTCWT that uses the dyadic decomposition of the DTCWT but achieves denser sampling in scale by interleaving several DTCWT trees, leading to reduced scale-related aliasing. This forms the foundation for the rest of our work. Then, we present a new DTCWT-based keypoint detector (BTK), which exhibits improved spatial localisation owing to the use of a more selective cornerness measure and keypoint localisation at individual levels of the 4S-DTCWT. A number of scale refinement approaches are investigated. The improved keypoint position and scale localisation directly leads to more robust image characterisation using DTCWT-based visual descriptors. We also present several ways of speeding up both the descriptor and the matching computations. These changes make it possible to use the system in practical scenarios. We develop a novel, fully automated framework for the evaluation of keypoint detectors and descriptors. This includes a new dataset containing 3978 calibrated images, taken by two cameras, of 39 different toy cars on a turntable. The dataset, calibration images, inter-camera calibration, rotational calibration and test scripts are publicly available. We establish ground truth correspondences using a three-image setup, with fixed angular separation between two of the three views, thus reducing the dependency on angular separation when compared to conventional epipolar-line search. Various keypoint detectors and descriptors were compared with DTCWT-based methods using this framework. To the extent possible, we separated the evaluation of the keypoint detectors from that of the descriptors. The main conclusion was that DTCWT-based methods can achieve performance comparable, if not superior, to that of established methods.
We also showed that, although the repeatability of keypoint detections falls off steeply as the viewing angle changes, descriptor similarity is hardly affected by viewpoint variation, provided the associated keypoint is detected at an approximately correct corresponding location. Finally, we show how an evaluation based purely on prior knowledge of the scene geometry can help eliminate the inaccuracies involved in appearance-based evaluations. This uses an enhanced epipolar constraint that exploits both the positions and scales of keypoints to constrain the range of possible matches.
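The enhanced epipolar constraint described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual implementation: it assumes a known fundamental matrix `F` between the two views, keypoints given as `(x, y, scale)` triples, and two hypothetical tolerance parameters (`px_tol`, `scale_ratio_max`) standing in for whatever bounds the thesis derives from the scene geometry.

```python
import numpy as np

def epipolar_distance(F, p1, p2):
    """Perpendicular distance (pixels) from point p2 to the
    epipolar line F @ p1. p1, p2 are homogeneous 3-vectors."""
    l = F @ p1
    return abs(l @ p2) / np.hypot(l[0], l[1])

def is_candidate_match(F, kp1, kp2, px_tol=2.0, scale_ratio_max=2.0):
    """Accept a putative match only if it satisfies both the
    positional epipolar constraint and a bound on the keypoint
    scale ratio (both tolerances are illustrative assumptions)."""
    (x1, y1, s1), (x2, y2, s2) = kp1, kp2
    p1 = np.array([x1, y1, 1.0])
    p2 = np.array([x2, y2, 1.0])
    # Position test: kp2 must lie near the epipolar line of kp1.
    if epipolar_distance(F, p1, p2) > px_tol:
        return False
    # Scale test: detected scales must be mutually consistent.
    r = max(s1, s2) / min(s1, s2)
    return r <= scale_ratio_max

# Example: F for a pure horizontal translation, where epipolar
# lines are horizontal (corresponding points share their y-coordinate).
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
```

Adding the scale test on top of the usual point-to-line distance prunes matches that happen to lie on the epipolar line but were detected at inconsistent scales, which is what narrows the range of possible matches relative to a plain epipolar-line search.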
dc.rights: All Rights Reserved
dc.rights.uri: https://www.rioxx.net/licenses/all-rights-reserved/
dc.title: Development and evaluation of a multiscale keypoint detector based on complex wavelets
dc.type: Thesis
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: Doctor of Philosophy (PhD)
dc.publisher.institution: University of Cambridge
dc.publisher.department: Department of Engineering
dc.identifier.doi: 10.17863/CAM.41623

