Title |
---|
Minimum Word Error Training For Non-Autoregressive Transformer-Based Code-Switching ASR |

Abstract |
---|
Non-autoregressive end-to-end ASR frameworks are potentially well suited to the code-switching recognition task, thanks to their inherent property that the present output token is independent of the historical ones. However, they still under-perform state-of-the-art autoregressive ASR frameworks. In this paper, we propose various approaches to boosting the performance of a CTC-mask-based non-autoregressive Transformer in the code-switching ASR scenario. To begin with, we attempt diversified masking methods that are closely related to code-switching points, yielding an improved baseline model. More importantly, we employ the Minimum Word Error (MWE) criterion to train the model. One of the challenges is how to generate a diversified hypothesis space, so as to obtain the average loss for a given ground truth. To address this challenge, we explore different approaches to yielding the desired N-best-based hypothesis space. We demonstrate the efficacy of the proposed methods on SEAME, a challenging English-Mandarin code-switching corpus from the Southeast Asian community. Compared with the strong cross-entropy-trained baseline, the proposed MWE training method achieves consistent performance improvements on the test sets. |
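
The MWE objective mentioned in the abstract is typically computed over an N-best list by renormalizing the model scores into a distribution and minimizing the expected number of word errors. Below is a minimal PyTorch sketch of this common formulation; the function name `mwe_loss` and the tensor layout are illustrative assumptions, and the paper's exact variant (in particular, how the N-best hypothesis space is generated from the CTC-mask-based model) may differ.

```python
import torch

def mwe_loss(nbest_logprobs: torch.Tensor, word_errors: torch.Tensor) -> torch.Tensor:
    """N-best Minimum Word Error loss for one utterance (a common
    formulation; illustrative, not the paper's exact recipe).

    nbest_logprobs: (N,) model log-probabilities of the N-best hypotheses
    word_errors:    (N,) word-error counts of each hypothesis vs. the ground truth
    """
    # Renormalize the model scores so they form a distribution over the N-best list.
    posteriors = torch.softmax(nbest_logprobs, dim=-1)
    errors = word_errors.float()
    # A constant baseline (mean error over the list) leaves the gradient
    # unchanged but reduces its variance.
    baseline = errors.mean()
    # Expected word errors, relative to the baseline, under the renormalized posterior.
    return torch.sum(posteriors * (errors - baseline))
```

Subtracting the mean error count rather than using the raw counts is a standard variance-reduction choice in N-best MWE training; because the renormalized posteriors sum to one, the constant baseline contributes nothing to the gradient.
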
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/ICASSP43922.2022.9746830 | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) |

DocType | Citations | PageRank
---|---|---|
Conference | 0 | 0.34

References | Authors
---|---|
0 | 5

Name | Order | Citations | PageRank |
---|---|---|---|
Yizhou Peng | 1 | 0 | 0.68 |
Jicheng Zhang | 2 | 0 | 1.01 |
Haihua Xu | 3 | 55 | 11.41 |
Hao Huang | 4 | 589 | 104.49 |
Eng Siong Chng | 5 | 970 | 106.33 |