Paraphrastic language models and combination with neural network language models
Liu, X., Gales, M. J. F., & Woodland, P. C. (2013). Paraphrastic language models and combination with neural network language models. In Proceedings of IEEE ICASSP 2013, 8421-8425. https://doi.org/10.1109/ICASSP.2013.6639308
In natural languages, multiple word sequences can represent the same underlying meaning. Modelling only the observed surface word sequence can result in poor context coverage, for example when using n-gram language models (LMs). To address this issue, paraphrastic LMs were proposed in previous research and successfully applied to a US English conversational telephone speech transcription task. In order to exploit the complementary characteristics of paraphrastic LMs and neural network LMs (NNLMs), their combination is investigated in this paper. To assess the generalization of paraphrastic LMs to other languages, experiments are conducted on a Mandarin Chinese broadcast speech transcription task. Using a paraphrastic multi-level LM modelling both word and phrase sequences, significant error rate reductions of 0.9% absolute (9% relative) and 0.5% absolute (5% relative) were obtained over the baseline n-gram and NNLM systems respectively, after combination with word- and phrase-level NNLMs.
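The combination of component LMs described in the abstract is commonly realized as linear interpolation of their word probabilities. Below is a minimal Python sketch of this idea, assuming hypothetical stand-in functions for the n-gram, paraphrastic, and NNLM components and illustrative interpolation weights; it is not the paper's specific multi-level combination scheme, and in practice the weights would be tuned (e.g. by EM on held-out data).

    import math

    def interpolate_log_prob(word, context, components, weights):
        # Linearly interpolate component LM probabilities for one word:
        # p(w | ctx) = sum_i weight_i * p_i(w | ctx), weights summing to 1.0.
        p = sum(w * comp(word, context) for comp, w in zip(components, weights))
        return math.log(p)

    # Hypothetical component LMs: stand-ins for an n-gram LM, a paraphrastic
    # LM, and an NNLM. A real system would query trained models here.
    ngram_lm = lambda w, ctx: 0.020
    paraphrastic_lm = lambda w, ctx: 0.030
    nnlm = lambda w, ctx: 0.025

    # Illustrative weights only, not values from the paper.
    weights = [0.4, 0.3, 0.3]

    lp = interpolate_log_prob("model", ("language",),
                              [ngram_lm, paraphrastic_lm, nnlm], weights)
    print("interpolated log-prob: %.4f" % lp)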
Keywords: language model, paraphrase, speech recognition
Funding: The research leading to these results was supported by EPSRC Programme Grant EP/I031022/1 (Natural Speech Technology).
External DOI: https://doi.org/10.1109/ICASSP.2013.6639308
This record's URL: https://www.repository.cam.ac.uk/handle/1810/245536