Comparing Few-Shot Learning with LLMs for Efficient Text Classification in Road Maintenance Applications

Efficient road maintenance is vital for long-lasting and safe transportation networks, but traditional methods that rely on manual inspection are labour-intensive and error-prone. The integration of Natural Language Processing (NLP) and Large Language Models (LLMs) presents a transformative solution for automating text-based tasks in road maintenance. This study investigates the application of LLM-based text classification models to process unstructured textual data from road maintenance logs, with a focus on resource-constrained scenarios characterised by limited labelled datasets. Two primary approaches were evaluated: traditional fine-tuning and few-shot learning. Using a public dataset from New York City roadwork inspections containing 83 distinct classes, we performed extensive model comparisons. Pre-trained transformer models Llama and BERT were fine-tuned to achieve baseline performance. Additionally, a few-shot learning method, SetFit, was employed to address data scarcity through efficient task adaptation using minimal labelled examples. Results showed SetFit outperformed fine-tuning in low-resource scenarios, achieving high accuracy and F1 scores with as few as 1-10 examples per class, reducing annotation efforts. This highlights the potential of few-shot learning for real-world deployment. Future work will address scalability, multimodal data fusion, and integration into predictive maintenance.
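
As an illustration of the low-resource setting described above, the following minimal sketch shows how a few-shot training subset of at most k examples per class might be drawn, and how a macro-averaged F1 score could be computed. It does not use SetFit, Llama, or BERT, and it is not the authors' pipeline. The function names `sample_k_per_class` and `macro_f1`, the toy log entries, and the choice of macro averaging are all assumptions made for illustration; the abstract does not state which F1 variant was reported.

```python
import random
from collections import defaultdict


def sample_k_per_class(texts, labels, k, seed=0):
    """Return up to k (text, label) pairs per class, chosen at random.

    Hypothetical helper: mimics building a few-shot training subset.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)
    subset = []
    for label in sorted(by_label):
        pool = by_label[label]
        # Classes with fewer than k examples contribute everything they have.
        chosen = rng.sample(pool, min(k, len(pool)))
        subset.extend((text, label) for text in chosen)
    return subset


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy maintenance-log entries (invented for illustration, not from the dataset).
texts = [
    "pothole on lane 2", "pothole near kerb", "pothole at junction",
    "sign knocked down", "sign faded", "drain blocked",
]
labels = ["pothole", "pothole", "pothole", "signage", "signage", "drainage"]

# At most 2 examples per class: 2 pothole + 2 signage + 1 drainage.
few_shot = sample_k_per_class(texts, labels, k=2)
```

With many classes and uneven class sizes, a macro average weights rare classes equally with common ones, which is one reason it is often reported alongside accuracy in settings like this.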

Keywords

4605 Data Management and Data Science, 46 Information and Computing Sciences, 3 Good Health and Well Being

Journal Title

Proceedings of the International Symposium on Automation and Robotics in Construction (IAARC)

Conference Name

Proceedings of the 42nd International Symposium on Automation and Robotics in Construction

Journal ISSN

2413-5844

Publisher

International Association for Automation and Robotics in Construction (IAARC)

Publisher DOI

https://doi.org/10.22260/isarc2025/0132

Sponsorship

European Commission Horizon 2020 (H2020) Marie Skłodowska-Curie actions (101034337)

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 101034337.

Collections

University of Cambridge Research Outputs (Articles and Conferences)