Abstract |
---|
We propose a novel Patched Multi-Condition Training (pMCT) method for robust Automatic Speech Recognition (ASR). pMCT employs Multi-condition Audio Modification and Patching (MAMP), mixing *patches* of the same utterance extracted from clean and distorted speech. Training on patch-modified signals improves the robustness of models in noisy reverberant scenarios. The proposed pMCT is evaluated on the LibriSpeech dataset, showing improvement over vanilla Multi-Condition Training (MCT). For analyses of robust ASR, we employed pMCT on the VOiCES dataset, a noisy reverberant dataset created from LibriSpeech utterances. In these analyses, pMCT achieves a 23.1% relative WER reduction compared to MCT. |
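The abstract describes MAMP as mixing patches of the same utterance drawn from its clean and distorted versions. A minimal sketch of that idea follows; the fixed patch length, the per-patch selection probability, and the function name are illustrative assumptions, not details from the paper.

```python
import numpy as np

def mamp_patch_mix(clean, distorted, patch_len=1600, p_distorted=0.5, seed=None):
    """Hypothetical sketch of patch-based mixing: split an utterance into
    fixed-length patches and, for each patch, keep either the clean or the
    distorted version. Patch length and probability are assumptions."""
    assert clean.shape == distorted.shape, "both versions must be aligned"
    rng = np.random.default_rng(seed)
    out = clean.copy()
    for start in range(0, len(clean), patch_len):
        # Randomly replace this patch with its distorted counterpart.
        if rng.random() < p_distorted:
            out[start:start + patch_len] = distorted[start:start + patch_len]
    return out
```

The patch-modified signal keeps the original length and alignment, so it can be fed to the ASR training pipeline in place of the clean or fully distorted waveform.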
Year | DOI | Venue
---|---|---
2022 | 10.21437/INTERSPEECH.2022-117 | Conference of the International Speech Communication Association (INTERSPEECH)

DocType | Citations | PageRank
---|---|---
Conference | 0 | 0.34
References | Authors
---|---
0 | 4
Name | Order | Citations | PageRank
---|---|---|---
Pablo Peso Parada | 1 | 0 | 0.34 |
Agnieszka Dobrowolska | 2 | 0 | 0.68 |
Karthikeyan Saravanan | 3 | 0 | 0.68 |
Mete Ozay | 4 | 0 | 1.35 |