Controlling the Noise Robustness of End-to-End Automatic Speech Recognition Systems - Citegraph

Paper Info

Title
Controlling the Noise Robustness of End-to-End Automatic Speech Recognition Systems

Abstract
In this work, we propose a novel training scheme to modularize end-to-end systems. Our training scheme aims at altering the flow of information in an end-to-end system so that the kernels of this system can be used in another system that fulfills another task. We apply this scheme to extract the noise reduction capabilities from a noise-robust automatic speech recognition (ASR) system and implement a speech enhancer from it. This enhancer receives spectral representations of unfiltered audio and outputs cleaned spectral representations. Our enhancer can be integrated into an ASR system as a front-end, is trainable, and reduces background noise. Our front-end uses a decoder to clean speech based on the hidden activations of the ASR system Jasper. While training, we exclusively adapt the weights in our decoder and the batch normalization in Jasper. The resulting spectral representations show less background noise. Further, areas in the spectral features are not reconstructed if they do not contribute to speech recognition. We demonstrate that our front-end can be combined with a pre-trained ASR system as a back-end and supports speech recognition in noisy conditions. Further, we show that training another ASR system with our front-end increases the performance of that ASR system in noisy as well as noiseless conditions. The ASR system's performance is especially improved on challenging speech datasets.

Year	DOI	Venue
2021	10.1109/IJCNN52387.2021.9533390	2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)
DocType	ISSN	Citations
Conference	2161-4393	0
PageRank	References	Authors
0.34	2	4

Authors (4 rows)

Name	Order	Citations	PageRank
Matthias Möller	1	1	4.08
Johannes Twiefel	2	12	4.48
Cornelius Weber	3	318	41.92
Stefan Wermter	4	1100	151.62
