Title
Accelerating RNN Transducer Inference via Adaptive Expansion Search
Abstract
The recurrent neural network transducer (RNN-T) is a promising end-to-end speech recognition framework that transduces input acoustic frames into a character sequence. Best-first and breadth-first searches have been used as decoding strategies for RNN-T. However, best-first search performs its expansion search sequentially, which slows down decoding. Breadth-first search replaces this sequential process with a parallel one, but it conducts an expansion search at every decoding step. Because the character sequence is much shorter than the sequence of decoding frames, most frames correspond to a blank symbol, so expanding at every step incurs unnecessary computational overhead. To address these limitations, we introduce adaptive expansion search (AES) to accelerate RNN-T inference. AES batches the hypotheses and adopts a decision-making process that determines whether to continue the expansion search, thereby avoiding unnecessary expansions. Pruning is also applied to AES for further acceleration. AES achieved a significant speedup and a lower word error rate than the baselines.
Year
2020
DOI
10.1109/LSP.2020.3036335
Venue
IEEE SIGNAL PROCESSING LETTERS
Keywords
Decoding, Speech recognition, Acoustic beams, Acceleration, Acoustics, Speech processing, Indexes, Beam search, RNN transducer
DocType
Journal
Volume
27
ISSN
1070-9908
Citations
0
PageRank
0.34
References
0
Authors
3
Name          Order   Citations   PageRank
Juntae Kim    1       9           8.72
Yoonhan Lee   2       0           0.34
Eesung Kim    3       1           1.73