An Automatic Method for Speech Breathing Annotation

Accepted version


Conference Object


Werner, Raphael 


Breathing is central to speech planning and production; however, speech breathing is difficult to monitor and quantify without laborious and subjective manual annotation. Here, we describe a method for automatically detecting the beginning and end time points of speech-associated inhalations measured with inductive plethysmography, or breath belts. Unlike simpler approaches to breath detection, the technique introduced here employs slope analysis to improve temporal precision. First, inhalation events are identified by searching for roughly continuous, positive sloping segments. Inhalations are then rejected or modified based on slope height, duration, and grade, as well as contextual factors, such as the height or duration of neighbouring breaths. Finally, the respiratory time series can be optionally corroborated with acoustic recordings to further improve results. This approach is validated by two independent annotators using spontaneous and read English speech contributed by 10 individual speakers, including relatively noisy data. From a signal detection perspective, we estimate performance at 95% on average. The mean median error of detected breaths, when compared to human annotation, is 22.50 ms (IQR 37.71 ms). By comparison, a peak-finding method without acoustic calibration yields 91% accuracy with substantially larger errors (mean median 167.90 ms, IQR 381.45 ms). In conclusion, the proposed automatic method provides robust and temporally accurate annotation of the speech breathing time series.
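The first two stages described in the abstract (finding roughly continuous, positively sloping segments, then rejecting candidates by height and duration) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `detect_inhalations`, its parameters, and all threshold values are hypothetical, and the slope-grade criterion, the contextual rules based on neighbouring breaths, and the acoustic corroboration step are omitted.

```python
import numpy as np


def detect_inhalations(signal, fs, min_height=0.1, min_duration_s=0.15,
                       max_gap_s=0.05):
    """Return (onset_s, offset_s) pairs for candidate inhalations.

    All thresholds are illustrative assumptions, not values from the paper.
    signal: 1-D breath-belt trace; fs: sampling rate in Hz.
    """
    signal = np.asarray(signal, dtype=float)
    rising = np.diff(signal) > 0  # True where the trace slopes upward

    # Pad with False so every rising run has both a start and an end edge.
    padded = np.concatenate(([False], rising, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)   # sample index of each trough
    ends = np.flatnonzero(edges == -1)    # sample index of each peak

    # "Roughly continuous": merge runs separated by very short interruptions.
    max_gap = int(round(max_gap_s * fs))
    merged = []
    for s, e in zip(starts, ends):
        if merged and s - merged[-1][1] <= max_gap:
            merged[-1][1] = e
        else:
            merged.append([s, e])

    # Reject candidates that are too shallow or too short.
    breaths = []
    for s, e in merged:
        height = signal[e] - signal[s]
        duration = (e - s) / fs
        if height >= min_height and duration >= min_duration_s:
            breaths.append((float(s / fs), float(e / fs)))
    return breaths


# Synthetic trace: a 2 s decline followed by a 0.5 s rise, repeated twice.
fs = 100
cycle = np.concatenate([np.linspace(1.0, 0.0, 200, endpoint=False),
                        np.linspace(0.0, 1.0, 50, endpoint=False)])
trace = np.tile(cycle, 2)
print(detect_inhalations(trace, fs))
```

Working from the sign of the first difference, rather than from peaks alone, places the onset and offset at the points where the slope actually changes direction. The abstract credits slope analysis with this gain in temporal precision over a peak-finding method.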



Journal Title

Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2023

Conference Name

Konferenz Elektronische Sprachsignalverarbeitung (ESSV)

Publisher

TUDpress, Dresden
