Benchmarking tree species classification from proximally sensed laser scanning data: Introducing the FOR‐species20K dataset

Accepted version
Peer-reviewed

Abstract

Proximally sensed laser scanning presents new opportunities for automated forest ecosystem data capture. However, a gap remains in deriving ecologically pertinent information, such as tree species, without additional ground data. Artificial intelligence approaches, particularly deep learning (DL), have shown promise towards automation. Progress has been limited by the lack of large, diverse, and, most importantly, openly available labelled single‐tree point cloud datasets. This has hindered both (1) the robustness of the DL models across varying data types (platforms and sensors) and (2) the ability to effectively track progress, thereby slowing the convergence towards best practice for species classification. To address these limitations, we compiled the FOR‐species20K benchmark dataset, consisting of individual tree point clouds captured using proximally sensed laser scanning from terrestrial (TLS), mobile (MLS) and drone (ULS) platforms. Compiled collaboratively, the dataset includes data collected in forests mainly across Europe, covering Mediterranean, temperate and boreal biogeographic regions. It also includes scattered tree data from other continents, totaling over 20,000 trees of 33 species and covering a wide range of tree sizes and forms. Alongside the release of FOR‐species20K, we benchmarked seven leading DL models for individual tree species classification, including both point cloud (PointNet++, MinkNet, MLP‐Mixer, DGCNNs) and multi‐view 2D‐based methods (SimpleView, DetailView, YOLOv5). 2D image‐based models had, on average, higher overall accuracy (0.77) than 3D point cloud‐based models (0.72). Notably, the performance was consistently >0.8 across scanning platforms and sensors, offering versatility in deployment. The top‐scoring model, DetailView, demonstrated robustness to training data imbalances and effectively generalized across tree sizes.
The FOR‐species20K dataset represents an important asset for developing and benchmarking DL models for individual tree species classification using proximally sensed laser scanning data. As such, it serves as a crucial foundation for future efforts to accurately classify and map tree species at various scales using laser scanning technology, as it provides the complete code base, dataset, and an initial baseline representative of the current state‐of‐the‐art of point cloud tree species classification methods.
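
The abstract compares models by overall accuracy and reports it across scanning platforms. As a minimal sketch of how such a metric could be computed, the snippet below defines overall accuracy and a per-group breakdown. The function names, species placeholders and labels are hypothetical and made up for illustration; this is not the authors' evaluation code and none of the values come from the dataset.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of trees whose predicted species matches the reference label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def accuracy_by_group(y_true, y_pred, groups):
    """Overall accuracy computed separately per group (e.g. scanning platform)."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: overall_accuracy(ts, ps) for g, (ts, ps) in buckets.items()}


# Toy, made-up labels purely for illustration (not values from the dataset).
y_true = ["species_a", "species_b", "species_c", "species_a"]
y_pred = ["species_a", "species_b", "species_a", "species_a"]
platform = ["TLS", "MLS", "ULS", "TLS"]

print(overall_accuracy(y_true, y_pred))
print(accuracy_by_group(y_true, y_pred, platform))
```

Overall accuracy weights every tree equally, so on an imbalanced dataset it is dominated by the most common species; a per-class or per-platform breakdown like the one above is one way to check whether a model's score holds up across subsets.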

Journal Title

Methods in Ecology and Evolution

Journal ISSN

2041-210X

Publisher

Wiley

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution 4.0 International

Sponsorship

MRC (MR/T019832/1)
EPSRC (2413201)
This work was supported by the COST Action 3DForEcoTech (CA20118). This work is part of the Center for Research-based Innovation SmartForest: Bringing Industry 4.0 to the Norwegian forest sector (NFR SFI project no. 309671, smartforest.no). ERL and HJFO were funded by a UKRI Future Leaders Fellowship awarded to ERL (MR/T019832/1). MJA was supported by the UKRI Centre for Doctoral Training in Application of Artificial Intelligence to the study of Environmental Risks (EP/S022961/1). We acknowledge the technical support and compute time at the Vienna Scientific Cluster VSC-5 for parts of the Ensemble-PointNet++ results. LW was funded in part by the Austrian Science Fund (FWF) [J4672]. The contribution of the DetailView model was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project FR 4404/1-1. NS was supported by the Academy of Finland through UNITE Flagship (357906) and Scan4rest Research Infrastructure (346382). KC was funded by the European Union (ERC-2021-STG Grant agreement No. 101039795). ET was funded by INEST - PNRR (Italian National Plan for Recovery and Resilience), Project ID ECS00000043. The REMBIOFOR dataset was funded by the National Centre for Research and Development in Poland under the BIOSTRATEG programme (grant agreement number BIOSTRATEG1/267755/4/NCBR/2015), project REMBIOFOR ‘Remote sensing-based assessment of woody biomass and carbon storage in forests’. KK, AM and MK were funded by INTER-COST project LUC23023.

Relationships

Is supplemented by: