Animal Reconstruction with 3D Morphable Models

Biggs, Benjamin

Repository URI

https://www.repository.cam.ac.uk/handle/1810/338198

Repository DOI

https://doi.org/10.17863/CAM.85609

Files

Thesis (36.82 MB)

Type

Thesis

Authors

Biggs, Benjamin

https://orcid.org/0000-0002-2419-8096

Abstract

Across many sectors concerned with animal husbandry, there is a growing need for automated tools for continuously monitoring animals under our care. In farmyards, zoos, veterinary centres, animal research facilities and many others, humans are typically responsible for identifying signs of disease or distress within their animal populations. While this can be effective, a significant challenge is posed when a small number of humans are expected to care for large animal groups. Existing automated systems often install cameras and use computer vision algorithms to track motion and/or predict behaviours. However, these approaches are either low fidelity or cannot be readily applied to identifying ill health due to ethical concerns around data collection. This thesis proposes a solution based on using 3D morphable models (3DMMs) as a useful intermediary 3D animal representation, enabling identification of adverse events and reducing the task-specific data requirements for downstream behaviour/welfare prediction tasks. A 3DMM is a specially designed mesh, supplied with parameters that constrain deformations. Typical reconstruction pipelines cast the reconstruction of a 3D articulated structure as estimating the 3DMM parameters for each input frame. This thesis focuses on designing methods for 3D animal reconstruction, making use of suitable 3DMMs. We as humans are highly proficient at estimating the 3D structure of articulated subjects, such as animals or people. Even from a single image or video sequence, a human can, with reasonable accuracy, predict the 3D locations of the animal’s limbs, estimate body proportions and even the camera’s location and viewing direction. However, these remain challenging tasks for computers. For this reason, 3DMMs provide a useful intermediate representation for downstream tasks since they help to explain the system’s output – if the recovered 3D model looks wrong, a human will notice immediately.
Recently, great progress has been made in 3D human reconstruction but there are particular challenges which impede naively transferring these techniques to animal categories. Firstly, human reconstruction techniques use large, specialized datasets of images with corresponding 2D and 3D annotations to learn accurate 3DMMs and train neural networks. Unfortunately, animal datasets are extremely limited in size, number and variability which has led to significant prior work in animals requiring per-frame manual annotations at test time. This lack of data also contributes to difficulties in designing detailed 3D priors, which are needed to cover the enormous pose and shape diversity among animal species when compared to humans. Other challenges faced in this thesis are common to humans and animals alike, such as when tackling ambiguous input imagery, for example caused by self or environmental occlusions. This thesis proposes a series of methods which are designed to tackle these challenges. Chapter 3 begins by introducing two new animal datasets which are relied upon in subsequent chapters. Chapter 4 discusses the first approach that performs fully automatic 3D reconstruction on a wide range of quadruped species, based on a graphics pipeline for generating synthetic data. Chapter 5 introduces WLDO, an automatic and real-time system for 3D dog reconstruction which covers an approach for improving the representational power of a low-fidelity 3DMM; a common problem for animals, since the lack of 3D training data necessitates learning from other sources, such as artist-designed figurines. The approach operates without a real 3D dataset and produces state-of-the-art accuracy on a challenging new dataset StanfordExtra, outperforming energy minimization approaches even when they are given ground-truth test-time annotations.
Chapter 6 proposes a technique for reconstructing images with heavy occlusion, which is an open problem across 3D reconstruction literature and a common failure mode of existing systems. The approach achieves state-of-the-art performance on challenging benchmarks.
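
As a rough illustration of the idea of a 3DMM as a mesh whose deformations are constrained by a small set of parameters, the sketch below builds a toy linear shape model and recovers its parameters from an observed mesh. It is not the model or fitting method used in the thesis: the two-vertex mesh, the basis directions, the names `deform` and `fit`, and the assumption that the basis vectors are mutually orthogonal are all hypothetical simplifications, and articulated pose is ignored entirely.

```python
# Toy linear morphable model: vertices = mean + sum_k beta[k] * basis[k].
# All numbers are invented for illustration; real 3DMMs have thousands of
# vertices and also model articulated pose, which this sketch omits.


def deform(mean, basis, beta):
    """Return deformed vertex coordinates for shape parameters beta."""
    return [
        m + sum(b_k * basis_k[i] for b_k, basis_k in zip(beta, basis))
        for i, m in enumerate(mean)
    ]


def fit(mean, basis, observed):
    """Least-squares estimate of beta.

    Assumes the basis vectors are mutually orthogonal, so each parameter
    can be recovered independently by projecting the residual onto its
    basis vector.
    """
    residual = [o - m for o, m in zip(observed, mean)]
    return [
        sum(r * b for r, b in zip(residual, basis_k))
        / sum(b * b for b in basis_k)
        for basis_k in basis
    ]


# Flattened (x, y, z) coordinates of a two-vertex "mesh".
MEAN = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
BASIS = [
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],  # hypothetical "body length" direction
    [0.0, 1.0, 0.0, 0.0, 1.0, 0.0],  # hypothetical "height" direction
]

# Synthesise an observation from known parameters, then recover them.
observed = deform(MEAN, BASIS, [0.5, -0.25])
recovered = fit(MEAN, BASIS, observed)
print(recovered)
```

In a per-frame pipeline of the kind the abstract describes, the observation would come from image evidence rather than from the model itself, and the estimate would typically be produced by an optimiser or a neural network rather than a closed-form projection.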

Date

2021-12-20

Advisors

Cipolla, Roberto
Fitzgibbon, Andrew

Keywords

computer vision, morphable models, 3d reconstruction, animal tracking

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights

Collections

Theses - Engineering