Title |
---|
Orthogonal Gradient Penalty for Fast Training of Wasserstein GAN Based Multi-Task Autoencoder toward Robust Speech Recognition |
Abstract |
---|
The performance of Automatic Speech Recognition (ASR) degrades dramatically in noisy environments. To alleviate this problem, a variety of deep networks based on convolutional and recurrent neural networks, trained with L1 or L2 loss, have been proposed. In this Letter, we propose a new orthogonal gradient penalty (OGP) method for Wasserstein Generative Adversarial Networks (WGAN) applied to denoising and de-speeching models. The WGAN integrates a multi-task autoencoder that estimates not only speech features but also noise features from noisy speech. The proposed OGP improves the Wasserstein distance convergence rate by 14.1%, and the OGP-enhanced features, when tested in ASR, achieve WER improvements of 9.7%, 8.6%, 6.2%, and 4.8% over the DDAE, MTAE, R-CED (CNN), and RNN models, respectively. |
Year | DOI | Venue
---|---|---
2020 | 10.1587/transinf.2019EDL8183 | IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS

Keywords | DocType | Volume
---|---|---
speech enhancement, generative adversarial networks, deep learning, robust speech recognition | Journal | E103D

Issue | ISSN | Citations
---|---|---
5 | 1745-1361 | 0

PageRank | References | Authors
---|---|---
0.34 | 0 | 5
Name | Order | Citations | PageRank |
---|---|---|---|
Chao-Yuan Kao | 1 | 0 | 0.34 |
SangWook Park | 2 | 0 | 4.06 |
Alzahra Badi | 3 | 0 | 0.34 |
David K. Han | 4 | 216 | 27.96 |
Hanseok Ko | 5 | 421 | 80.24 |