Title
Missing-Feature Reconstruction by Leveraging Temporal Spectral Correlation for Robust Speech Recognition in Background Noise Conditions
Abstract
This paper proposes a novel missing-feature reconstruction method to improve speech recognition in background noise environments. The existing missing-feature reconstruction method exploits log-spectral correlation across frequency bands. We propose a temporal spectral feature analysis that improves reconstruction performance by also leveraging temporal correlation across neighboring frames. As in the conventional method, a Gaussian mixture model is trained on the resulting temporal spectral feature set. The final estimates for missing-feature reconstruction are obtained by a selective combination of the original frequency correlation-based method and the proposed temporal correlation-based method. The proposed method is evaluated on the TIMIT speech corpus under various types of background noise and on the CU-Move in-vehicle speech corpus. Experimental results demonstrate that the proposed method is more effective than the original frequency correlation-based method at improving speech recognition performance in adverse conditions. With the proposed temporal-frequency based reconstruction method, a +17.71% average relative improvement in word error rate (WER) is obtained for white, car, speech babble, and background music noise at 5, 10, and 15 dB SNR, compared to the original frequency correlation-based method. We also obtain a +16.72% relative improvement in real-life in-vehicle conditions using data from the CU-Move corpus.
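The abstract does not spell out the reconstruction formulas, so the following is only a minimal sketch of a generic GMM-based conditional-mean estimate of unreliable log-spectral components, not the authors' exact method. The function name `reconstruct_missing`, its arguments, and the upper-bounding of the estimate by the noisy observation (which assumes additive noise) are illustrative assumptions. The same routine could in principle be applied to vectors spanning frequency bands or to vectors built from neighboring frames; the selective combination of the two estimates is not shown because the abstract does not describe it.

```python
import numpy as np

def reconstruct_missing(y, reliable, weights, means, covs):
    """Hypothetical helper: GMM conditional-mean estimate of unreliable components.

    y        : (D,) observed noisy log-spectral vector
    reliable : (D,) boolean mask, True where the component is trusted
    weights  : (K,) mixture weights
    means    : (K, D) component means
    covs     : (K, D, D) full component covariances
    """
    y = np.asarray(y, dtype=float)
    r = np.asarray(reliable, dtype=bool)
    m = ~r
    if not m.any():
        return y.copy()  # nothing to reconstruct

    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    K = means.shape[0]
    log_post = np.log(np.asarray(weights, dtype=float))
    cond_means = np.empty((K, int(m.sum())))

    for k in range(K):
        mu_r, mu_m = means[k, r], means[k, m]
        if r.any():
            S_rr = covs[k][np.ix_(r, r)]
            S_mr = covs[k][np.ix_(m, r)]
            diff = y[r] - mu_r
            sol = np.linalg.solve(S_rr, diff)
            _, logdet = np.linalg.slogdet(S_rr)
            # log-likelihood of the reliable part under component k
            log_post[k] += -0.5 * (diff @ sol + logdet + r.sum() * np.log(2 * np.pi))
            # conditional mean of the missing part given the reliable part
            cond_means[k] = mu_m + S_mr @ sol
        else:
            cond_means[k] = mu_m  # no reliable evidence: use the prior mean

    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    x_hat = y.copy()
    # Assumption: with additive noise, the clean value cannot exceed the noisy one.
    x_hat[m] = np.minimum(post @ cond_means, y[m])
    return x_hat

# Toy usage with made-up numbers: one Gaussian, two correlated bands.
toy_means = np.array([[0.0, 0.0]])
toy_covs = np.array([[[1.0, 0.5], [0.5, 1.0]]])
estimate = reconstruct_missing([2.0, 5.0], [True, False], [1.0], toy_means, toy_covs)
```

In the toy call the second band is marked unreliable, so it is replaced by the conditional mean given the first band, capped at the observed noisy value.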
Year
2010
DOI
10.1109/TASL.2010.2041698
Venue
IEEE Transactions on Audio, Speech & Language Processing
Keywords
novel missing-feature reconstruction method,cu-move in-vehicle speech corpus,timit speech corpus,temporal spectral correlation,speech recognition,noise,reconstruction method,missing feature reconstruction,correlation-based method,missing-feature,background noise,background noise conditions,conventional method,temporal correlation,temporal spectral feature,temporal spectral feature analysis,proposed temporal correlation-based method,gaussian processes,missing-feature reconstruction,robust speech recognition,existing missing-feature reconstruction method,leveraging temporal spectral correlation,background noise condition,gaussian mixture model,correlation methods,word error rate,hidden markov models,speech,correlation,noise measurement,feature analysis
Field
Speech corpus,TIMIT,Speech processing,Background noise,Pattern recognition,Noise measurement,Computer science,Word error rate,Signal-to-noise ratio,Speech recognition,Artificial intelligence,Hidden Markov model
DocType
Journal
Volume
18
Issue
8
ISSN
1558-7916
Citations
3
PageRank
0.41
References
25
Authors
2
Name               Order  Citations  PageRank
Wooil Kim          1      120        16.95
John H. L. Hansen  2      3215       365.75