Annotating large lattices with the exact word error


Type

Conference Object

Authors

Van Dalen, RC 
Gales, MJF 

Abstract

The acoustic model in modern speech recognisers is trained discriminatively, for example with the minimum Bayes risk criterion. This criterion is hard to compute exactly, so it is normally approximated by a criterion that uses fixed alignments of lattice arcs. This approximation becomes particularly problematic with new types of acoustic models that require flexible alignments. It would be best to annotate lattices with the risk measure of interest, the exact word error. However, the algorithm for this uses finite-state automaton determinisation, which has exponential complexity and runs out of memory for large lattices. This paper introduces a novel method for determinising and minimising finite-state automata incrementally. Since it uses less memory, it can be applied to larger lattices.
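The exponential complexity mentioned above comes from classical subset construction, in which each state of the determinised automaton is a set of states of the input automaton, so up to 2^N subsets may be created. The sketch below illustrates that standard construction (not the paper's incremental algorithm) on a made-up toy NFA; all names and the example automaton are illustrative assumptions.

```python
from collections import deque

def determinise(alphabet, delta, start, finals):
    """Classical subset construction: each DFA state is a frozenset of
    NFA states. In the worst case the number of subsets is exponential
    in the number of NFA states, which is the memory blow-up that
    motivates incremental determinisation."""
    start_set = frozenset([start])
    dfa_delta = {}
    dfa_finals = set()
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        subset = queue.popleft()
        if subset & finals:
            dfa_finals.add(subset)
        for symbol in alphabet:
            # Union of all NFA targets reachable from this subset.
            target = frozenset(t for s in subset
                               for t in delta.get((s, symbol), ()))
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen, dfa_delta, start_set, dfa_finals

# Toy NFA over {a, b} accepting strings whose second-to-last symbol is 'a'.
delta = {
    (0, 'a'): {0, 1}, (0, 'b'): {0},
    (1, 'a'): {2},    (1, 'b'): {2},
}
dfa_states, dfa_delta, dfa_start, dfa_finals = determinise(
    {'a', 'b'}, delta, 0, {2})
print(len(dfa_states))  # 4 reachable subsets
```

An incremental scheme, as proposed in the paper, avoids materialising all subsets at once; the code above builds the full set `seen` in memory, which is exactly what fails for large lattices.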

Keywords

speech recognition, discriminative training, minimum Bayes risk

Journal Title

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Journal ISSN

2308-457X
1990-9772

Publisher

ISCA

Sponsorship

This work was supported by EPSRC Project EP/I006583/1 (Generative Kernels and Score Spaces for Classification of Speech) within the Global Uncertainties Programme and by a Google Research Award.