Chain-of-Thought-based Knowledge Extraction from Heterogeneous Infrastructure Database for Integrated Transportation Asset Management

Accepted version
Peer-reviewed

Abstract

The fragmentation of infrastructure information systems has long been an obstacle to integrated transportation asset management (TAM). This paper presents a novel method for automatic knowledge extraction and ontology modelling from heterogeneous TAM databases using large language models (LLMs). The method adopts a Chain-of-Thought framework to decompose the complex ontology modelling process into atomic tasks, harnessing the semantic understanding and reasoning capabilities of LLMs. As a result, class entities, class hierarchies, and relations are generated to construct an ontology model that supports semantic interoperability across diverse TAM systems. The method’s performance was evaluated using four sets of TAM database schemas from UK road agencies. The results show that the overall recall rate for entity generation reaches 89.5% compared to the standard ontology. Furthermore, the accuracy rates for entity classification and relation classification are 82.1% and 75.6%, respectively, demonstrating the effectiveness of the proposed LLM-based approach in addressing data fragmentation issues in transportation information systems.
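The reported figures are a recall rate for generated entities and accuracy rates for classification. As a minimal sketch of how such metrics could be computed, the Python below assumes that entities are matched by case-insensitive exact name and that each classified item has a single gold label. Both the matching rule and the toy data are hypothetical and are not taken from the paper, which may use a different matching procedure.

```python
def entity_recall(generated, reference):
    """Fraction of reference entities that also appear in the generated set.

    Assumption: names match if equal after trimming and lower-casing.
    """
    gen = {name.strip().lower() for name in generated}
    ref = {name.strip().lower() for name in reference}
    if not ref:
        raise ValueError("reference ontology has no entities")
    return len(gen & ref) / len(ref)


def classification_accuracy(predicted, gold):
    """Fraction of gold-labelled items whose predicted label matches."""
    if not gold:
        raise ValueError("no gold labels")
    correct = sum(1 for key, label in gold.items() if predicted.get(key) == label)
    return correct / len(gold)


# Hypothetical toy data, for illustration only.
reference_entities = ["Bridge", "Carriageway", "Drainage Asset", "Inspection"]
generated_entities = ["bridge", "carriageway", "inspection", "Street Light"]

gold_parents = {
    "Bridge": "Structure",
    "Carriageway": "Pavement",
    "Inspection": "Activity",
}
predicted_parents = {
    "Bridge": "Structure",
    "Carriageway": "Structure",
    "Inspection": "Activity",
}

recall = entity_recall(generated_entities, reference_entities)
accuracy = classification_accuracy(predicted_parents, gold_parents)
print(f"entity recall: {recall:.3f}")
print(f"classification accuracy: {accuracy:.3f}")
```

Recall is taken over the reference set, so extra generated entities (such as the invented "Street Light" above) do not lower it; a precision figure would be needed to capture those.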

Conference Name

IEEE Intelligent Transportation Systems Society Conference (IEEE ITSC 2025)

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution 4.0 International

Sponsorship

European Horizon 2020