Accelerating RNN Transducer Inference via Adaptive Expansion Search - Citegraph

Paper Info

Title
Accelerating RNN Transducer Inference via Adaptive Expansion Search

Abstract
Recurrent neural network transducers (RNN-T) are a promising end-to-end speech recognition framework that transduces input acoustic frames into a character sequence. Best-first and breadth-first searches have been used as decoding strategies for RNN-T. However, best-first search expands hypotheses sequentially, which slows down decoding. Breadth-first search replaces this sequential process with a parallel one, but it conducts an expansion search at every decoding step, which is unnecessary. Because the character sequence is much shorter than the sequence of decoding frames, most decoding frames correspond to a blank symbol, so expanding at every step incurs computational overhead. To address these limitations, we introduce adaptive expansion search (AES) to accelerate RNN-T inference. AES batches the hypotheses and adopts a decision-making process that determines whether to continue the expansion search, thereby avoiding unnecessary expansions. Pruning is also applied to AES for further acceleration. We achieved a significant speedup and a lower word error rate than the baseline methods.
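
The point about blank-dominated frames can be illustrated with a toy count of joint-network evaluations. The sketch below is not the paper's AES algorithm: it keeps a single hypothesis, has no beam, batching, or pruning, and replaces the model with a hypothetical oracle (`frames[t]` lists the non-blank tokens a made-up model would emit at frame `t`). The names `decode`, `max_expansions`, and `adaptive` are assumptions introduced for illustration only. It compares a fixed number of expansion steps per frame with a rule that stops expanding as soon as blank is chosen.

```python
BLANK = 0  # assumed id of the blank symbol in this toy


def decode(frames, max_expansions=3, adaptive=True):
    """Toy decoder that counts joint-network evaluations.

    frames[t] is the list of non-blank token ids a hypothetical model
    would emit at frame t; once they are exhausted it predicts BLANK.
    """
    output, joint_calls = [], 0
    for targets in frames:
        emitted = 0
        for _ in range(max_expansions):
            joint_calls += 1  # one joint-network evaluation
            token = targets[emitted] if emitted < len(targets) else BLANK
            if token == BLANK:
                if adaptive:
                    break  # decide to stop expanding at this frame
                continue  # fixed schedule still spends the remaining steps
            output.append(token)
            emitted += 1
    return output, joint_calls


# Six frames, four of which are pure blank.
frames = [[], [5], [], [], [7, 8], []]
print(decode(frames, adaptive=True))   # ([5, 7, 8], 9)
print(decode(frames, adaptive=False))  # ([5, 7, 8], 18)
```

Both schedules produce the same token sequence in this toy, but the adaptive stop halves the number of evaluations because most frames need only a single step to confirm blank.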

Field	Value
Year	2020
DOI	10.1109/LSP.2020.3036335
Venue	IEEE SIGNAL PROCESSING LETTERS
Volume	27
ISSN	1070-9908
DocType	Journal
Keywords	Decoding, Speech recognition, Acoustic beams, Acceleration, Acoustics, Speech processing, Indexes, Beam search, RNN transducer
Citations	0
PageRank	0.34
References	0
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Juntae Kim	1	9	8.72
Yoonhan Lee	2	0	0.34
Eesung Kim	3	1	1.73
