Voice activity detection in eco-acoustic data enables privacy protection and is a proxy for human disturbance

Published version
Peer-reviewed

Abstract

Eco‐acoustic monitoring is increasingly being used to map biodiversity across large scales, yet little thought is given to the privacy concerns and potential scientific value of inadvertently recorded human speech. Automated speech detection is possible using voice activity detection (VAD) models, but it is not clear how well these perform in diverse natural soundscapes. In this study we present the first evaluation of VAD models for anonymization of eco‐acoustic data and demonstrate how speech detection frequency can be used as one potential measure of human disturbance.

We first generated multiple synthetic datasets using different data preprocessing techniques to train and validate deep neural network models. We evaluated the performance of our custom models against existing state‐of‐the‐art VAD models using playback experiments with speech samples from a man, woman and child. Finally, we collected long‐term data from a Norwegian forest heavily used for hiking to evaluate the ability of the models to detect human speech and quantify a proxy for human disturbance in a real monitoring scenario.

In playback experiments, all models could detect human speech with high accuracy at distances where the speech was intelligible (up to 10 m). We showed that training models using location-specific soundscapes in the data preprocessing step resulted in a slight improvement in model performance. Additionally, we found that the number of speech detections correlated with peak traffic hours (identified using bus timings), demonstrating how VAD can be used to derive a proxy for human disturbance with fine temporal resolution.

Anonymizing audio data effectively using VAD models will allow eco‐acoustic monitoring to continue to deliver invaluable ecological insight at scale, while minimizing the risk of data misuse. Furthermore, using speech detections as a proxy for human disturbance opens new opportunities for eco‐acoustic monitoring to shed light on nuanced human–wildlife interactions.

Journal Title

Methods in Ecology and Evolution

Journal ISSN

2041-210X

Publisher

Wiley

Rights and licensing

Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by-nc/4.0/

Sponsorship

Norges Forskningsråd (160022/F40)