
Improving the Training and Evaluation Efficiency of Recurrent Neural Network Language Models


Abstract

Recurrent neural network language models (RNNLMs) are becoming increasingly popular for speech recognition. Previously, we have shown that RNNLMs with a full (non-classed) output layer (F-RNNLMs) can be trained efficiently on a GPU, giving a large reduction in training time over conventional class-based models (C-RNNLMs) trained on a standard CPU. However, since test-time RNNLM evaluation is often performed entirely on a CPU, standard F-RNNLMs are inefficient: the entire output layer must be computed for normalisation. In this paper, it is demonstrated that C-RNNLMs can be efficiently trained on a GPU using our spliced sentence bunch technique, which allows good CPU test-time performance ($42\times$ speedup over an F-RNNLM). Furthermore, the performance of different classing approaches is investigated. We also examine the use of variance regularisation of the softmax denominator for F-RNNLMs and show that it allows F-RNNLMs to be used efficiently in test ($56\times$ speedup on a CPU). Finally, the use of two GPUs for F-RNNLM training using pipelining is described and shown to reduce training time over a single GPU by a factor of $1.6\times$.
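As a brief illustrative sketch of two of the techniques named above (a standard formulation consistent with the abstract's description; the notation $\gamma$, $a_v(h)$ and $Z(h)$ is ours and may differ from the paper's), a C-RNNLM factorises the word probability through a class layer,

$P(w \mid h) = P\bigl(c(w) \mid h\bigr)\, P\bigl(w \mid c(w), h\bigr),$

so that test-time normalisation runs only over the class set and the words of one class rather than the full vocabulary. Variance regularisation instead augments the cross-entropy criterion with a penalty on the variance of the log softmax denominator,

$J(\theta) = -\frac{1}{N}\sum_{i=1}^{N} \log P(w_i \mid h_i) + \frac{\gamma}{2N}\sum_{i=1}^{N} \bigl(\log Z(h_i) - \overline{\log Z}\bigr)^2, \qquad Z(h) = \sum_{v \in \mathcal{V}} e^{a_v(h)},$

where $a_v(h)$ is the output-layer activation for word $v$ given history $h$ and $\overline{\log Z}$ denotes the mean log denominator. Because $\log Z(h)$ is driven towards a constant during training, the unnormalised score $e^{a_v(h)}$ can serve directly as a probability estimate at test time, avoiding the $O(|\mathcal{V}|)$ normalisation sum on the CPU.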

Description

Journal Title

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference Name

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Journal ISSN

1520-6149

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Rights and licensing

Except where otherwise noted, this item's license is described as http://www.rioxx.net/licenses/all-rights-reserved

Sponsorship

Xie Chen is supported by Toshiba Research Europe Ltd, Cambridge Research Lab. The research leading to these results was also supported by EPSRC grant EP/I031022/1 (Natural Speech Technology) and by DARPA under the Broad Operational Language Translation (BOLT) and RATS programs. The paper does not necessarily reflect the position or policy of the US Government, and no official endorsement should be inferred.