Paraphrastic neural network language models


Type
Conference Object
Authors
Liu, X 
Gales, MJF 
Woodland, PC 
Abstract

Expressive richness in natural languages presents a significant challenge for statistical language models (LMs). As multiple word sequences can represent the same underlying meaning, modelling only the observed surface word sequence can lead to poor context coverage. To handle this issue, paraphrastic LMs were previously proposed to improve the generalization of back-off n-gram LMs. This paper investigates paraphrastic neural network LMs (NNLMs). Using a paraphrastic multi-level feedforward NNLM that models both word and phrase sequences, significant error rate reductions of 1.3% absolute (8% relative) and 0.9% absolute (5.5% relative) were obtained over the baseline n-gram and NNLM systems respectively, on a state-of-the-art conversational telephone speech recognition system trained on 2000 hours of audio and 545 million words of text.
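
As a rough illustration only (not the authors' multi-level word/phrase model), the Python sketch below trains a tiny Bengio-style feedforward n-gram NNLM on a toy corpus in which each sentence is accompanied by a hand-written paraphrase variant, a crude stand-in for the paraphrase-based training data expansion the abstract refers to. The corpus, paraphrase pairs, hyperparameters and variable names are all invented assumptions for this sketch.

    # Minimal feedforward trigram NNLM sketch; all data below is illustrative.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy corpus plus hand-written paraphrase variants (assumption: a real
    # paraphrastic LM would derive these from a learned paraphrase model).
    corpus = [
        "i want to book a flight",
        "i would like to reserve a flight",   # paraphrase of the line above
        "the meeting starts at noon",
        "the meeting begins at noon",         # paraphrase of the line above
    ]

    # Vocabulary and integer-encoded trigram (history, target) examples.
    words = sorted({w for s in corpus for w in s.split()})
    vocab = {w: i for i, w in enumerate(words)}
    V, n, d, h = len(vocab), 3, 16, 32        # vocab size, order, embed dim, hidden dim
    data = []
    for s in corpus:
        ids = [vocab[w] for w in s.split()]
        for t in range(n - 1, len(ids)):
            data.append((ids[t - n + 1:t], ids[t]))

    # Parameters: word embeddings, hidden layer, softmax output layer.
    C = rng.normal(0, 0.1, (V, d))
    W1 = rng.normal(0, 0.1, ((n - 1) * d, h)); b1 = np.zeros(h)
    W2 = rng.normal(0, 0.1, (h, V));           b2 = np.zeros(V)

    def forward(hist):
        x = C[hist].reshape(-1)               # concatenate history embeddings
        z = np.tanh(x @ W1 + b1)              # hidden layer
        logits = z @ W2 + b2
        p = np.exp(logits - logits.max()); p /= p.sum()
        return x, z, p

    # Plain SGD on the cross-entropy loss.
    lr = 0.1
    for epoch in range(200):
        for hist, target in data:
            x, z, p = forward(hist)
            dlogits = p.copy(); dlogits[target] -= 1.0
            dW2 = np.outer(z, dlogits); db2 = dlogits
            dz = (W2 @ dlogits) * (1 - z ** 2)
            dW1 = np.outer(x, dz); db1 = dz
            dx = W1 @ dz
            W2 -= lr * dW2; b2 -= lr * db2
            W1 -= lr * dW1; b1 -= lr * db1
            for k, w_id in enumerate(hist):   # update history word embeddings
                C[w_id] -= lr * dx[k * d:(k + 1) * d]

    # After training, continuations contributed only by the paraphrase
    # variants (e.g. "reserve" after "like to") receive probability mass.
    _, _, p = forward([vocab["like"], vocab["to"]])
    print(f"P(reserve | like to) = {p[vocab['reserve']]:.3f}")

The point of the sketch is simply that adding paraphrase variants to the training data lets the model assign probability to contexts never seen in the original surface sentences; the paper's actual method additionally models phrase-level sequences and combines multiple NNLM levels.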

Keywords
neural network language model, paraphrase, speech recognition
Journal Title
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Conference Name
ICASSP 2014 - 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Journal ISSN
1520-6149
Publisher
IEEE
Sponsorship
The research leading to these results was supported by EPSRC grant EP/I031022/1 (Natural Speech Technology) and DARPA under the Broad Operational Language Translation (BOLT) program.