Abstract |
---|
In this paper, we propose methods to compute a confidence score for the predictions made by an end-to-end speech recognition model in a 2-pass framework. We use an RNN-Transducer as the streaming model and an attention-based decoder as the second-pass model. We use a neural technique to compute the confidence score, and experiment with various combinations of features from the RNN-Transducer and second-pass models. The neural confidence score model is trained as a binary classifier to accept or reject a prediction made by the speech recognition model. The model is evaluated in a distributed speech recognition environment, and performs significantly better when features from the second-pass model are used than when features from the streaming model are used. |
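
The accept/reject formulation in the abstract can be illustrated with a minimal sketch. It is not the authors' model: a single logistic unit stands in for the neural confidence model, the two input features (a first-pass score and a second-pass score per hypothesis) are hypothetical, and the data is synthetic. The function names `train_confidence_model` and `confidence` are likewise made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_confidence_model(features, labels, lr=0.1, epochs=500):
    """Fit a logistic unit by batch gradient descent on cross-entropy loss.

    features: (n, d) array, one row per recognized hypothesis.
    labels:   (n,) array of 0/1, where 1 means the hypothesis should be accepted.
    """
    n, d = features.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(features @ w + b)   # predicted probability of "accept"
        grad = p - labels               # gradient of the loss w.r.t. the logit
        w -= lr * (features.T @ grad) / n
        b -= lr * grad.mean()
    return w, b

def confidence(features, w, b):
    """Return a confidence score in [0, 1] for each hypothesis."""
    return sigmoid(features @ w + b)

# Synthetic data (assumption): two hypothetical features per hypothesis.
rng = np.random.default_rng(0)
n = 200
labels = (np.arange(n) % 2).astype(float)
sign = 2.0 * labels - 1.0               # -1 for reject, +1 for accept
features = np.column_stack([
    0.25 * sign + rng.normal(0.0, 1.0, n),  # hypothetical first-pass score
    1.50 * sign + rng.normal(0.0, 1.0, n),  # hypothetical second-pass score
])

w, b = train_confidence_model(features, labels)
scores = confidence(features, w, b)
accepted = scores >= 0.5                # accept when confidence is at least 0.5
accuracy = (accepted == (labels == 1.0)).mean()
```

The 0.5 threshold is an arbitrary choice for the sketch. In practice the operating point would be tuned to the desired balance between false accepts and false rejects.
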
Year | DOI | Venue |
---|---|---|
2021 | 10.1109/ICASSP39728.2021.9414467 | 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021) |
Keywords | DocType | Citations |
---|---|---|
Neural confidence measure, end-to-end speech recognition, RNN-Transducers, Two pass | Conference | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 0 | 7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ashutosh Gupta | 1 | 1 | 2.38 |
Ankur N Kumar | 2 | 8 | 3.39 |
Dhananjaya Gowda | 3 | 3 | 5.47 |
Kwangyoun Kim | 4 | 2 | 4.11 |
Sachin Singh | 5 | 0 | 1.35 |
Shatrughan Singh | 6 | 1 | 2.71 |
Chanwoo Kim | 7 | 253 | 28.44 |