Abstract |
---|
In speech enhancement, the optimal minimum mean square error (MMSE) short-time spectral amplitude estimator requires knowledge of the probability density functions of speech and noise in the short-time Fourier transform domain for every signal-to-noise ratio (SNR). However, both of these quantities are unknown and usually non-stationary in real-world scenarios. To tackle this problem, this paper proposes a speech enhancement approach based on a set of neural networks, each developed for a particular critical band and a predefined SNR. In this approach, the neural networks simulate a set of gain functions that attempt to match human hearing and are each optimised for a particular SNR. The neural networks are trained using one speech signal contaminated with pink noise. The trained networks are evaluated on a test set of 28 noisy speech signals. The speech enhancement results are compared with a state-of-the-art MMSE-based speech enhancement technique in terms of four speech quality metrics: noise reduction ratio (NRR), intelligibility frequency-weighted segmental SNR (IFWSNRseg), perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI). The evaluation demonstrates the effectiveness of the neural networks. |
Year | Venue | Field |
---|---|---|
2017 | Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | Speech enhancement, Noise measurement, Computer science, Pink noise, Signal-to-noise ratio, Minimum mean square error, Speech recognition, Artificial neural network, PESQ, Intelligibility (communication) |
DocType | ISSN | Citations |
Conference | 2309-9402 | 0 |
PageRank | References | Authors |
0.34 | 0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Pei Chee Yong | 1 | 36 | 5.79 |
Kit Yan Chan | 2 | 470 | 45.36 |
Sven Nordholm | 3 | 405 | 62.82 |