Title | ||
---|---|---|
Missing-Feature Reconstruction by Leveraging Temporal Spectral Correlation for Robust Speech Recognition in Background Noise Conditions |
Abstract | ||
---|---|---|
This paper proposes a novel missing-feature reconstruction method to improve speech recognition in background noise environments. The existing missing-feature reconstruction method utilizes log-spectral correlation across frequency bands. In this paper, we propose to employ a temporal spectral feature analysis to improve the missing-feature reconstruction performance by leveraging temporal correlation across neighboring frames. As in the conventional method, a Gaussian mixture model is trained on the resulting temporal spectral feature set. The final estimates for missing-feature reconstruction are obtained by a selective combination of the original frequency correlation-based method and the proposed temporal correlation-based method. Performance of the proposed method is evaluated on the TIMIT speech corpus under various types of background noise and on the CU-Move in-vehicle speech corpus. Experimental results demonstrate that the proposed method is more effective at improving speech recognition performance in adverse conditions. With the proposed temporal-frequency reconstruction method, a 17.71% average relative improvement in word error rate (WER) is obtained for white, car, speech babble, and background music noise at 5, 10, and 15 dB SNR, compared to the original frequency correlation-based method. We also obtain a 16.72% relative improvement in real-life in-vehicle conditions using data from the CU-Move corpus. |
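The abstract does not give the estimation formulas, so the following is only a minimal sketch of the general idea of Gaussian-mixture-based reconstruction: unreliable entries of a feature vector are replaced by their posterior-weighted conditional mean given the reliable entries. The function name `gmm_reconstruct`, the reliability mask, and the use of a plain conditional mean are assumptions for illustration and may differ from the paper's method. The same routine could in principle be applied to a vector of frequency bands within one frame or to a vector of neighboring frames within one band.

```python
import numpy as np

def gmm_reconstruct(x, reliable, weights, means, covs):
    """Replace unreliable entries of x by their conditional mean under a GMM.

    Hypothetical helper for illustration only; not the paper's exact estimator.
    x        : (D,) feature vector (e.g. log-spectral values)
    reliable : (D,) boolean mask, True where x is trusted
    weights  : (K,) mixture weights
    means    : (K, D) component means
    covs     : (K, D, D) full component covariances
    """
    x = np.asarray(x, dtype=float)
    r = np.asarray(reliable, dtype=bool)
    m = ~r
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)

    out = x.copy()
    if not m.any():          # nothing to reconstruct
        return out
    if not r.any():          # no evidence: fall back to the prior mean
        return weights @ means

    K = len(weights)
    log_post = np.empty(K)
    cond_means = np.empty((K, int(m.sum())))
    for k in range(K):
        S_rr = covs[k][np.ix_(r, r)]       # covariance among reliable entries
        S_mr = covs[k][np.ix_(m, r)]       # cross-covariance missing/reliable
        diff = x[r] - means[k][r]
        sol = np.linalg.solve(S_rr, diff)  # S_rr^{-1} (x_r - mu_r)
        _, logdet = np.linalg.slogdet(S_rr)
        # Log of weight times Gaussian likelihood of the reliable part
        log_post[k] = np.log(weights[k]) - 0.5 * (
            logdet + diff @ sol + r.sum() * np.log(2.0 * np.pi)
        )
        # Conditional mean of the missing part for component k
        cond_means[k] = means[k][m] + S_mr @ sol

    post = np.exp(log_post - log_post.max())  # normalize in a stable way
    post /= post.sum()
    out[m] = post @ cond_means
    return out

# Toy usage: one component, two correlated dimensions, second entry unreliable.
# The conditional mean of the second entry is 0 + 0.5 * 2.0 = 1.0.
est = gmm_reconstruct(
    x=[2.0, 5.0],
    reliable=[True, False],
    weights=[1.0],
    means=[[0.0, 0.0]],
    covs=[[[1.0, 0.5], [0.5, 1.0]]],
)
print(est)
```

A selective combination, as described in the abstract, would then choose between or merge a frequency-based and a temporal-based estimate of this kind; the selection rule is not stated here, so it is left out of the sketch.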
Year | DOI | Venue |
---|---|---|
2010 | 10.1109/TASL.2010.2041698 | IEEE Transactions on Audio, Speech & Language Processing |
Keywords | Field | DocType |
---|---|---|
novel missing-feature reconstruction method,cu-move in-vehicle speech corpus,timit speech corpus,temporal spectral correlation,speech recognition,noise,reconstruction method,missing feature reconstruction,correlation-based method,missing-feature,background noise,background noise conditions,conventional method,temporal correlation,temporal spectral feature,temporal spectral feature analysis,proposed temporal correlation-based method,gaussian processes,missing-feature reconstruction,robust speech recognition,existing missing-feature reconstruction method,leveraging temporal spectral correlation,background noise condition,gaussian mixture model,correlation methods,word error rate,hidden markov models,speech,correlation,noise measurement,feature analysis | Speech corpus,TIMIT,Speech processing,Background noise,Pattern recognition,Noise measurement,Computer science,Word error rate,Signal-to-noise ratio,Speech recognition,Artificial intelligence,Hidden Markov model | Journal |
Volume | Issue | ISSN |
---|---|---|
18 | 8 | 1558-7916 |
Citations | PageRank | References |
---|---|---|
3 | 0.41 | 25 |
Authors | ||
---|---|---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wooil Kim | 1 | 120 | 16.95 |
John H. L. Hansen | 2 | 3215 | 365.75 |