Efficient lattice rescoring using recurrent neural network language models
Change log
Authors
Abstract
Recurrent neural network language models (RNNLM) have become an increasingly popular choice for state-of-the-art speech recognition systems due to their inherently strong generalization performance. As these models use a vector representation of complete history contexts, RNNLMs are normally used to rescore N-best lists. Motivated by their intrinsic characteristics, two novel lattice rescoring methods for RNNLMs are investigated in this paper. The first uses an n-gram style clustering of history contexts. The second approach directly exploits the distance measure between hidden history vectors. Both methods produced 1-best performance comparable with a 10k-best rescoring baseline RNNLMsystem on a large vocabulary conversational telephone speech recognition task. Significant lattice size compression of over 70% and consistent improvements after confusion network (CN) decoding were also obtained over the N-best rescoring approach.
Description
This is the accepted manuscript of a paper published in the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, Issue Date: 4-9 May 2014, Written by: Liu, X.; Wang, Y.; Chen, X.; Gales, M.J.F.; Woodland, P.C.).