
I-vector estimation using informative priors for adaptation of deep neural networks


Authors

Karanasou, P 
Gales, M 
Woodland, P 

Abstract

I-vectors are a well-known low-dimensional representation of speaker space and are becoming increasingly popular in adaptation of state-of-the-art deep neural network (DNN) acoustic models. One advantage of i-vectors is that they can be estimated from very little data, for example a single utterance. However, to improve the robustness of i-vector estimates with limited data, a prior is often used. Traditionally, a standard normal prior is applied to i-vectors, but this is not well suited to the increased variability of short utterances. This paper proposes a more informative prior, derived from the training data. As well as aiming to reduce the non-Gaussian behaviour of the i-vector space, it allows prior information at different levels, for example gender, to be used. Experiments on a US English Broadcast News (BN) transcription task with speaker-level and utterance-level i-vector adaptation show that more informative priors reduce the sensitivity to the quantity of data used to estimate the i-vector. The best configuration for this task was utterance-level test i-vectors enhanced with informative priors, which gave a 13% relative reduction in word error rate over the baseline (no i-vectors) and a 5% relative reduction over utterance-level test i-vectors with the standard prior.
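
The abstract describes replacing the standard normal i-vector prior with an informative Gaussian prior derived from the training data. As a minimal illustration only (not the paper's exact formulation, and with all variable names, shapes, and statistics assumed for the sketch), the Python/NumPy fragment below shows how a MAP point estimate of an i-vector changes when a general Gaussian prior N(prior_mean, prior_prec^-1) replaces the standard N(0, I): the prior precision and prior mean simply enter the usual linear-Gaussian posterior update.

    import numpy as np

    def ivector_map_estimate(T, Sigma_inv, N, f, prior_mean, prior_prec):
        """Hypothetical sketch: MAP i-vector estimate under a Gaussian prior.

        T          : (CF, D) total-variability matrix (rows stacked over mixture components)
        Sigma_inv  : (CF, CF) inverse UBM covariance (block-diagonal, kept dense here for brevity)
        N          : (CF, CF) zeroth-order sufficient statistics expanded to a diagonal matrix
        f          : (CF,)    centred first-order sufficient statistics
        prior_mean : (D,)     informative prior mean (all zeros recovers the standard prior mean)
        prior_prec : (D, D)   informative prior precision (identity recovers the standard prior)
        """
        # Posterior precision: prior precision plus the data-dependent precision term.
        precision = prior_prec + T.T @ Sigma_inv @ N @ T
        # Posterior mean: data term plus the pull of the informative prior towards prior_mean.
        rhs = T.T @ Sigma_inv @ f + prior_prec @ prior_mean
        return np.linalg.solve(precision, rhs)

    # Setting prior_mean = np.zeros(D) and prior_prec = np.eye(D) recovers the conventional
    # standard-normal-prior estimate; an informative prior (e.g. gender-dependent mean and
    # covariance estimated from training i-vectors) changes only these two inputs.
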

Description

This is the author accepted manuscript. The final version is available from ISCA via http://www.isca-speech.org/archive/interspeech_2015/i15_2872.html

Supporting data for this paper are available from the data repository at http://www.repository.cam.ac.uk/handle/1810/248387.

Keywords

i-vectors, speaker adaptation, prior information, deep neural networks

Journal Title

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Conference Name

Journal ISSN

2308-457X
1990-9772

Volume Title

Publisher

ISCA

Sponsorship

This work was supported by EPSRC Programme Grant EP/I031022/1 (Natural Speech Technology).