Recurrent neural network language model training with noise contrastive estimation for speech recognition


Type

Conference Object

Authors

Chen, X 
Liu, X 
Gales, MJF 
Woodland, PC 

Abstract

In recent years, recurrent neural network language models (RNNLMs) have been successfully applied to a range of tasks including speech recognition. However, an important issue that limits the quantity of training data and their range of possible application areas is the computational cost of training. A significant part of this cost is associated with the softmax function at the output layer, as it requires a normalization term to be explicitly calculated. This impacts both training and testing speed, especially when a large output vocabulary is used. To address this problem, noise contrastive estimation (NCE) is used in RNNLM training in this paper. It does not require this normalization in either training or testing, and is insensitive to the output layer size. On a large vocabulary conversational telephone speech recognition task, a doubling in training speed and a 56-fold speed-up in test-time evaluation were obtained.
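
As a rough illustration of why NCE avoids the softmax sum over the vocabulary, the following minimal Python sketch computes the standard NCE objective for a single (context, target) pair. The function and variable names are illustrative, not taken from the paper, and the model's partition function is assumed fixed to 1 (self-normalization), so a word's model probability is simply exp(score).

import numpy as np

def nce_loss(score_target, scores_noise, logq_target, logq_noise, k):
    # Minimal sketch of the standard NCE objective (names illustrative,
    # not from the paper). With the partition function assumed fixed to 1,
    # the model probability of a word is exp(score), so no sum over the
    # vocabulary is ever computed.
    p_target = np.exp(score_target)       # unnormalized model probability of the true word
    p_noise = np.exp(scores_noise)        # unnormalized model probabilities of the k noise words
    kq_target = k * np.exp(logq_target)   # k * q(w) for the true word under the noise distribution
    kq_noise = k * np.exp(logq_noise)     # k * q(w~_i) for each noise word
    # Posterior probability that the true word was drawn from the data
    loss = -np.log(p_target / (p_target + kq_target))
    # Posterior probability that each noise word was drawn from the noise distribution
    loss -= np.sum(np.log(kq_noise / (p_noise + kq_noise)))
    return loss

Because the loss touches only the target word and the k sampled noise words, the per-word training cost is independent of the vocabulary size, and at test time a word's probability can be read off as exp(score) without computing the softmax denominator.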

Description

Keywords

language model, recurrent neural network, GPU, noise contrastive estimation, speech recognition

Journal Title

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Conference Name

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Journal ISSN

1520-6149

Volume Title

Publisher

IEEE

Sponsorship

Xie Chen is supported by Toshiba Research Europe Ltd, Cambridge Research Lab. The research leading to these results was also supported by EPSRC grant EP/I031022/1 (Natural Speech Technology) and by DARPA under the Broad Operational Language Translation (BOLT) and RATS programs. The paper does not necessarily reflect the position or the policy of the US Government, and no official endorsement should be inferred. The authors would also like to thank Ashish Vaswani from USC for suggestions and discussion on training NNLMs with NCE.