A rainfall similarity-based dataset construction framework for enhanced urban inundation prediction using machine learning
Accepted version
Peer-reviewed
Abstract
Accurate prediction of urban inundation depends heavily on the quality of datasets used in machine learning (ML) models. To address this need, this study proposes a rainfall similarity-based dataset construction framework that enhances data-driven inundation forecasting by incorporating process-oriented rainfall features. The framework employs a multi-distance fusion method to quantify similarity among rainfall events based on key process-oriented features, including total rainfall depth, rainfall duration, maximum rainfall intensity, rainfall center location, and spatial distribution pattern. Representative events are then curated to build high-quality training datasets that align with the dynamics of rainfall–inundation processes. To evaluate performance, both Random Forest (RF) and Deep Neural Network (DNN) models were tested using integrated datasets combining observational records and hydrodynamic simulations from a flood-prone metropolitan area. Results show that similarity-guided training significantly improves predictive performance, with inundation-extent accuracy exceeding 85 % on average and approaching 95 % in certain scenarios. While both ML models benefit from the framework, RF consistently achieves higher accuracy than DNN, indicating strong synergy with similarity-based dataset construction. This study highlights the pivotal role of similarity-guided datasets in bridging rainfall process analysis and ML modeling, providing a scalable and practical tool for reliable urban inundation prediction and disaster mitigation.
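The multi-distance fusion idea described in the abstract can be illustrated with a minimal sketch. The abstract names the five rainfall features but not the individual distance measures, the weights, or the selection rule, so everything below is an assumption made for illustration: relative difference for the scalar features, scaled Euclidean distance for the rainfall center, cosine distance for the spatial pattern, an equal-weight average as the fusion step, and a simple top-k selection of representative events. The names `RainEvent`, `fused_distance`, `most_similar`, and `center_scale` are hypothetical and do not come from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class RainEvent:
    name: str
    total_depth: float    # total rainfall depth, mm
    duration: float       # rainfall duration, h
    max_intensity: float  # maximum rainfall intensity, mm/h
    center: tuple         # rainfall center location (x, y), km
    pattern: tuple        # spatial distribution: share of rainfall per gauge


def rel_diff(a, b):
    """Relative difference in [0, 1] for two non-negative scalars."""
    denom = max(abs(a), abs(b))
    return 0.0 if denom == 0 else abs(a - b) / denom


def cosine_distance(u, v):
    """1 - cosine similarity; lies in [0, 1] for non-negative vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    if nu == 0 or nv == 0:
        return 1.0
    # Clamp to absorb floating-point round-off around 0 and 1.
    return min(1.0, max(0.0, 1.0 - dot / (nu * nv)))


# Assumed equal weights; the paper's actual weighting is not given in the abstract.
DEFAULT_WEIGHTS = {
    "depth": 1.0, "duration": 1.0, "intensity": 1.0,
    "center": 1.0, "pattern": 1.0,
}


def fused_distance(a, b, weights=DEFAULT_WEIGHTS, center_scale=10.0):
    """Weighted average of per-feature distances, each scaled to [0, 1].

    center_scale (km) is an assumed normalisation length for the
    distance between rainfall centers.
    """
    parts = {
        "depth": rel_diff(a.total_depth, b.total_depth),
        "duration": rel_diff(a.duration, b.duration),
        "intensity": rel_diff(a.max_intensity, b.max_intensity),
        "center": min(1.0, math.dist(a.center, b.center) / center_scale),
        "pattern": cosine_distance(a.pattern, b.pattern),
    }
    total_weight = sum(weights[key] for key in parts)
    return sum(weights[key] * parts[key] for key in parts) / total_weight


def most_similar(target, candidates, k, weights=DEFAULT_WEIGHTS, center_scale=10.0):
    """Return the k candidate events closest to the target event."""
    ranked = sorted(
        candidates,
        key=lambda event: fused_distance(target, event, weights, center_scale),
    )
    return ranked[:k]


if __name__ == "__main__":
    target = RainEvent("target", 80.0, 6.0, 30.0, (0.0, 0.0), (0.5, 0.3, 0.2))
    history = [
        RainEvent("storm_a", 82.0, 6.0, 31.0, (1.0, 0.0), (0.5, 0.3, 0.2)),
        RainEvent("storm_b", 20.0, 2.0, 8.0, (9.0, 9.0), (0.1, 0.2, 0.7)),
    ]
    for event in most_similar(target, history, k=2):
        print(event.name, round(fused_distance(target, event), 3))
```

In a workflow like the one the abstract outlines, the events returned by such a selection step would supply the training samples for the RF or DNN model. Scaling every per-feature distance to [0, 1] before fusing keeps any single feature, such as total depth in millimetres, from dominating the combined score.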
Journal ISSN
1879-2707
Sponsorship
National Natural Science Foundation of China
University of Cambridge
China Scholarship Council

