
Stimulated training for automatic speech recognition and keyword search in limited resource conditions

Accepted version
Peer-reviewed

Type

Conference Object

Change log

Authors

Ragni, A 
Gales, MJF 
Vasilakes, J 
Knill, KM 

Abstract

© 2017 IEEE. Training neural network acoustic models on limited quantities of data is a challenging task. A number of techniques have been proposed to improve generalisation. This paper investigates one such technique called stimulated training. It enables standard criteria such as cross-entropy to enforce spatial constraints on activations originating from different units. Having different regions be active depending on the input may help the network discriminate better and, as a consequence, yield lower error rates. This paper investigates stimulated training for automatic speech recognition of a number of languages representing different families, alphabets, phone sets and vocabulary sizes. In particular, it looks at ensembles of stimulated networks to ensure that the improved generalisation withstands system combination effects. In order to assess stimulated training beyond 1-best transcription accuracy, this paper uses keyword search as a proxy for assessing lattice quality. Experiments are conducted on IARPA Babel program languages, including the surprise language of the OpenKWS 2016 competition.
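The abstract's core idea — arranging hidden units on a grid and using the training criterion to encourage class-dependent spatial activation patterns — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it assumes a KL-style penalty between normalised activations and a per-class Gaussian "stimulus" pattern (the grid layout, `sigma`, and class-centre mapping are illustrative choices), added to the usual cross-entropy loss during training.

```python
import numpy as np

def grid_coords(n_units):
    # Place hidden units on a square 2D grid in [0, 1]^2
    # (assumes n_units is a perfect square).
    side = int(np.sqrt(n_units))
    ys, xs = np.meshgrid(np.arange(side), np.arange(side), indexing="ij")
    return np.stack([ys.ravel(), xs.ravel()], axis=1) / (side - 1)

def target_pattern(centre, coords, sigma=0.2):
    # Gaussian "stimulus" pattern centred at a class-specific grid location,
    # normalised to sum to one.
    d2 = ((coords - centre) ** 2).sum(axis=1)
    t = np.exp(-d2 / (2.0 * sigma ** 2))
    return t / t.sum()

def stimulated_penalty(activations, centres, labels, coords, sigma=0.2):
    # KL(target || normalised activations), averaged over frames; this term
    # would be added (with a weight) to cross-entropy during training so that
    # different grid regions fire for different classes.
    pen = 0.0
    for a, y in zip(activations, labels):
        p = a / a.sum()                      # activations as a distribution over the grid
        t = target_pattern(centres[y], coords, sigma)
        pen += np.sum(t * np.log((t + 1e-12) / (p + 1e-12)))
    return pen / len(labels)
```

For example, a frame whose (non-negative) activations already match its class's target pattern incurs essentially zero penalty, while a flat or misplaced activation pattern is penalised — which is the spatial constraint the abstract refers to.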

Description

Keywords

limited resources, stimulated training, joint decoding, keyword search

Journal Title

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Conference Name

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Journal ISSN

1520-6149

Volume Title

Publisher

IEEE

Sponsorship

IARPA (4912046943)