Abstract | ||
---|---|---|
A blind bandwidth extension is presented which improves the perceived quality of 4 kHz speech by artificially extending the speech's frequency range to 8 kHz. Based on the source-filter model of human speech production, the speech signal is decomposed into a spectral envelope and an excitation signal, and each is extrapolated separately. With this decomposition, good perceptual quality can be achieved while keeping the computational complexity low. The focus of this work is on the generation of an excitation signal with an autoregressive model that calculates a distribution for each audio sample conditioned on previous samples. This is achieved with a deep neural network following the architecture of LPCNet [1]. A listening test shows that it significantly improves the perceived quality of bandlimited speech. The system has an algorithmic delay of 30 ms and can be applied in state-of-the-art speech and audio codecs. |
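
The source-filter decomposition mentioned in the abstract can be illustrated with classic linear prediction. The sketch below is not the authors' implementation: it assumes the spectral envelope is modeled by an all-pole filter 1/A(z) estimated with the autocorrelation method, and that the excitation is the prediction residual. The function names (`lpc_coefficients`, `analysis`, `synthesis`), the filter order, and the toy test signal are all hypothetical choices made for illustration.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Estimate LPC coefficients with the autocorrelation method
    (Levinson-Durbin recursion). Returns a with a[0] == 1."""
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12  # small floor guards against an all-zero frame
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a


def analysis(x, a):
    """Inverse-filter x with A(z) to obtain the excitation (prediction residual)."""
    return np.convolve(x, a)[: len(x)]


def synthesis(e, a):
    """Filter the excitation e with the all-pole envelope filter 1/A(z)."""
    order = len(a) - 1
    y = np.zeros(len(e))
    for n in range(len(e)):
        past = sum(a[k] * y[n - k] for k in range(1, min(order, n) + 1))
        y[n] = e[n] - past
    return y


# Toy "speech-like" signal: white noise shaped by a known resonant filter
# (an assumption for the demo, not data from the paper).
rng = np.random.default_rng(0)
speech_like = synthesis(rng.standard_normal(2000), np.array([1.0, -1.6, 0.9]))

a = lpc_coefficients(speech_like, order=2)  # spectral envelope parameters
excitation = analysis(speech_like, a)       # excitation signal
rebuilt = synthesis(excitation, a)          # envelope applied to the excitation
```

In this toy setting the excitation carries far less energy than the signal itself, and filtering it with the envelope reproduces the input. A bandwidth extension along the lines of the abstract would extrapolate the two components separately before recombining them; that step is not shown here.
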
Year | DOI | Venue |
---|---|---|
2020 | 10.23919/Eusipco47968.2020.9287465 | 2020 28th European Signal Processing Conference (EUSIPCO) |
Keywords | DocType | ISSN |
---|---|---|
bandwidth extension, artificial bandwidth expansion, speech enhancement, audio super resolution, speech super resolution | Conference | 2219-5491 |
ISBN | Citations | PageRank |
---|---|---|
978-1-7281-5001-7 | 0 | 0.34 |
References | Authors |
---|---|
0 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Konstantin Schmidt | 1 | 17 | 2.21 |
Bernd Edler | 2 | 83 | 16.76 |