Vision-based excavator pose estimation using synthetically generated datasets with domain randomization
Accepted version
Peer-reviewed
Authors
Abstract
The ability to monitor and track the interactions between construction equipment and workers can lead to a safer and more productive work environment. Most recent studies employ computer vision and deep learning techniques, whose performance depends on the size and quality of the training datasets. However, the preparation of large datasets with high-quality annotations remains a manual and time-consuming process. To overcome this challenge, this study presents a framework for synthetically generating large, accurately annotated image datasets. The contribution of this paper is twofold: First, a method is developed using a game engine, which employs domain randomization (DR) to produce large labelled datasets for excavator pose estimation. Second, a state-of-the-art deep learning architecture based on a high-resolution network (HRNet) is adapted and modified for excavator pose estimation. This model is trained on the synthetically generated datasets and its performance is evaluated. The results reveal that the model trained on synthetic data can yield results comparable to the model trained on real images of excavators. This demonstrates the effectiveness of utilizing synthetic datasets for complex vision tasks such as equipment pose estimation. The study concludes by highlighting directions for further work on synthetic data in construction.
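In domain randomization, each synthetic image is rendered under independently sampled scene parameters (machine articulation, camera placement, lighting, textures) so that a model trained on the resulting dataset generalizes to real imagery. The sketch below illustrates the sampling step only; the parameter names, ranges, and texture choices are illustrative assumptions, not values taken from the paper, and the actual rendering would be done by the game engine.

```python
import random

def sample_scene_parameters(rng=None):
    """Sample one randomized scene configuration for synthetic image
    generation (domain randomization). All parameter names and ranges
    are illustrative assumptions, not taken from the paper."""
    rng = rng or random.Random()
    return {
        # Excavator joint angles (degrees): boom, arm (stick), bucket.
        "boom_angle": rng.uniform(-10.0, 60.0),
        "arm_angle": rng.uniform(30.0, 150.0),
        "bucket_angle": rng.uniform(0.0, 120.0),
        # Camera placed on a sphere around the machine.
        "camera_azimuth": rng.uniform(0.0, 360.0),
        "camera_elevation": rng.uniform(5.0, 45.0),
        "camera_distance": rng.uniform(8.0, 30.0),
        # Appearance randomization: lighting intensity and ground texture.
        "sun_intensity": rng.uniform(0.3, 1.5),
        "ground_texture": rng.choice(["soil", "gravel", "concrete", "grass"]),
    }

# One configuration per synthetic image; a seeded RNG makes the
# dataset reproducible.
dataset_config = [sample_scene_parameters(random.Random(i)) for i in range(1000)]
```

Because the 2D keypoint annotations are projected directly from the known 3D joint positions in the engine, every rendered image comes with exact labels at no extra annotation cost.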
Journal ISSN
Journal ISSN
1872-7891
Sponsorship
Leverhulme Trust (IAF-2018-011)
Australian Research Council (DP170104613)
European Commission Horizon 2020 (H2020) Industrial Leadership (IL) (958398)
EPSRC (EP/V056441/1)