Abstract |
---|
Automatic speech recognition (ASR) systems are now vital to commonplace tasks such as speech-to-text processing and language translation. This has created a need for ASR systems that can operate in realistic, crowded environments. Speech enhancement is therefore a valuable building block in ASR systems and in other applications such as hearing aids, smartphones, and teleconferencing systems. In this paper, a generative adversarial network (GAN) based framework is investigated for speech enhancement, specifically the denoising of audio tracks. A new architecture based on a CasNet generator and an additional feature-based loss are incorporated to obtain realistically denoised speech phonetics. Finally, the proposed framework is shown to outperform other learning-based and traditional model-based speech enhancement approaches. |
Year | DOI | Venue |
---|---|---|
2020 | 10.23919/Eusipco47968.2020.9287606 | 2020 28th European Signal Processing Conference (EUSIPCO) |
Keywords | DocType | ISSN |
Speech enhancement, generative adversarial networks, automatic speech recognition, deep learning | Conference | 2219-5491 |
ISBN | Citations | PageRank |
978-1-7281-5001-7 | 0 | 0.34 |
References | Authors | |
8 | 5 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Sherif Abdulatif | 1 | 0 | 0.34 |
Karim Armanious | 2 | 0 | 0.34 |
Karim Guirguis | 3 | 0 | 0.34 |
Jayasankar T. Sajeev | 4 | 0 | 0.34 |
Bin Yang | 5 | 201 | 49.22 |