Language independent and unsupervised acoustic models for speech recognition and keyword spotting

Knill, KM; Gales, MJF; Ragni, A; Rath, SP

Accepted version

Peer-reviewed

Repository URI

https://www.repository.cam.ac.uk/handle/1810/279188

Repository DOI

https://doi.org/10.17863/CAM.26568

Type

Conference Object

Authors

Knill, KM

Gales, MJF

Ragni, A

Rath, SP

Abstract

Copyright © 2014 ISCA. Developing high-performance speech processing systems for low-resource languages is very challenging. One approach to addressing the lack of resources is to make use of data from multiple languages. A popular direction in recent years is to train a multi-language bottleneck DNN. Language dependent and/or multi-language (all training languages) Tandem acoustic models (AMs) are then trained. This work considers a particular scenario where the target language is unseen in multi-language training and has limited language model training data, a limited lexicon, and acoustic training data without transcriptions. First, a zero acoustic resources case is described in which a multi-language AM is applied directly, as a language independent AM (LIAM), to an unseen language. Second, in an unsupervised approach, a LIAM is used to obtain hypothesised transcriptions of the target language acoustic data, which are then used to train a language dependent AM. Three languages from the IARPA Babel project are used for assessment: Vietnamese, Haitian Creole and Bengali. Performance of the zero acoustic resources system is found to be poor, with keyword spotting at best 60% of language dependent performance. Unsupervised language dependent training yields performance gains. For one language (Haitian Creole) the Babel target is achieved on the in-vocabulary data.

Keywords

speech recognition, low resource, multilingual

Journal Title

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Conference Name

Interspeech 2014

Journal ISSN

2308-457X
1990-9772

Publisher DOI

https://doi.org/10.17863/CAM.26568

Rights

http://www.rioxx.net/licenses/all-rights-reserved

Sponsorship

IARPA (4912046943)

Collections

Cambridge University Research Outputs