Vision Language Model-driven Multimodal Occlusion Analysis (VLM-MOA) Method for Construction Progress Monitoring

Accepted version
Peer-reviewed

Abstract

Accurate and timely construction progress updates are vital for effective project management. Drone-based data collection offers a cost-effective, efficient means of generating point cloud scans to visualize and track progress. However, drone flights capture only surface-level scenes, so elements constructed during long intervals between flights may go undetected, resulting in inaccurate progress assessments. This study introduces an approach that integrates multimodal information with Vision Language Models (VLMs) to infer construction progress. A case study in Poland used periodic drone-generated point cloud scans to monitor a construction project. The proposed method employs point proximity metrics and occlusion analysis, leveraging VLM-driven spatial and logical reasoning to address data gaps and infer occluded progress. Results validate the efficacy of this multimodal VLM-driven approach. Future work will focus on completing the overall workflow.

Conference Name

ASCE International Conference on Computing in Civil Engineering (i3CE 2025)

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution 4.0 International.
Sponsorship
Engineering and Physical Sciences Research Council (EP/S02302X/1)