Comparing Few-Shot Learning with LLMs for Efficient Text Classification in Road Maintenance Applications

Accepted version
Peer-reviewed

Abstract

Efficient road maintenance is vital for long-lasting and safe transportation networks, but traditional methods that rely on manual inspection are labour-intensive and error-prone. The integration of Natural Language Processing (NLP) and Large Language Models (LLMs) presents a transformative solution for automating text-based tasks in road maintenance. This study investigates the application of LLM-based text classification models to process unstructured textual data from road maintenance logs, with a focus on resource-constrained scenarios characterised by limited labelled datasets. Two primary approaches were evaluated: traditional fine-tuning and few-shot learning. Using a public dataset from New York City roadwork inspections containing 83 distinct classes, we performed extensive model comparisons. Pre-trained transformer models Llama and BERT were fine-tuned to achieve baseline performance. Additionally, a few-shot learning method, SetFit, was employed to address data scarcity through efficient task adaptation using minimal labelled examples. Results showed SetFit outperformed fine-tuning in low-resource scenarios, achieving high accuracy and F1 scores with as few as 1-10 examples per class, reducing annotation efforts. This highlights the potential of few-shot learning for real-world deployment. Future work will address scalability, multimodal data fusion, and integration into predictive maintenance.
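As a rough illustration of the low-resource setting described in the abstract, the sketch below draws at most k labelled examples per class and scores predictions with a macro-averaged F1. It is not the authors' pipeline: the function names `sample_k_shot` and `macro_f1`, the toy records, and the choice of macro averaging are assumptions made for illustration, and the abstract does not state which F1 variant was reported.

```python
import random
from collections import defaultdict


def sample_k_shot(records, k, seed=0):
    """Return up to k (text, label) pairs per label, chosen at random."""
    by_label = defaultdict(list)
    for text, label in records:
        by_label[label].append((text, label))
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    subset = []
    for label in sorted(by_label):
        items = by_label[label]
        # Classes with fewer than k examples contribute all they have.
        subset.extend(rng.sample(items, min(k, len(items))))
    return subset


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy records; these are not taken from the NYC dataset.
records = [
    ("pothole on lane 2", "pothole"),
    ("deep pothole near kerb", "pothole"),
    ("pothole reported by resident", "pothole"),
    ("faded lane marking", "marking"),
    ("crosswalk paint worn", "marking"),
    ("sign knocked down", "signage"),
]
few_shot = sample_k_shot(records, k=2)
```

A subset built this way could then be passed to whichever few-shot or fine-tuning method is being compared, with the same held-out test set scored for each value of k.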

Journal Title

Proceedings of the International Symposium on Automation and Robotics in Construction (IAARC)

Conference Name

Proceedings of the 42nd International Symposium on Automation and Robotics in Construction

Journal ISSN

2413-5844

Publisher

International Association for Automation and Robotics in Construction (IAARC)

Rights and licensing

Except where otherwise noted, this item's license is described as All Rights Reserved
Sponsorship
European Commission Horizon 2020 (H2020) Marie Skłodowska-Curie actions (101034337)
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 101034337.