Towards Acoustic-to-Articulatory Inversion for Pronunciation Training

Accepted version
Peer-reviewed

Type

Conference Object

Authors

McGhee, Charles 
Knill, Kate 
Gales, Mark 

Abstract

Visual feedback of articulators using Electromagnetic Articulography (EMA) has been shown to aid acquisition of non-native speech sounds. Using physical EMA sensors is expensive and invasive, making it impractical for providing real-world pronunciation feedback. Our work focuses on using neural Acoustic-to-Articulatory Inversion (AAI) models to map speech directly to EMA sensor positions. Self-Supervised Learning (SSL) speech models, such as HuBERT, can produce representations of speech that have been shown to significantly improve performance on AAI tasks. Probing experiments have indicated that certain layers and iterations of SSL models produce representations that may yield better inversion performance than others. In this paper, we build on these probing results to create an AAI model that improves upon a state-of-the-art baseline inversion model and evaluate the model’s suitability for pronunciation training.

Conference Name

The 9th Workshop on Speech and Language Technology in Education

Sponsorship
Cambridge Assessment (unknown)
EPSRC DTP and Vice Chancellor's Award.
Relationships
Is supplemented by: