Title
Exploring the robustness of features and enhancement on speech recognition systems in highly-reverberant real environments.
Abstract
This paper evaluates the robustness of a DNN-HMM-based speech recognition system in highly-reverberant real environments using the HRRE database. The performance of locally-normalized filter bank (LNFB) and Mel filter bank (MelFB) features in combination with the Non-negative Matrix Factorization (NMF), Suppression of Slowly-varying components and the Falling edge (SSF), and Weighted Prediction Error (WPE) enhancement methods is discussed and evaluated. Two training conditions were considered: clean and reverberated (Reverb). With Reverb training, the use of WPE and LNFB provides WERs that are 3% and 20% lower on average than SSF and NMF, respectively. WPE and MelFB provide WERs that are 11% and 24% lower on average than SSF and NMF, respectively. With clean training, which represents a significant mismatch between testing and training conditions, LNFB features clearly outperform MelFB features. The results show that different types of training, parametrization, and enhancement techniques may work better for a specific combination of speaker-microphone distance and reverberation time. This suggests that there could be some degree of complementarity between systems trained with different enhancement and parametrization methods.
Year: 2018
Venue: arXiv: Audio and Speech Processing
Field: Complementarity (molecular biology), Reverberation, Parametrization, Computer science, Filter bank, Matrix decomposition, Robustness (computer science), Speech recognition, Non-negative matrix factorization, Signal edge
DocType: Journal
Volume: abs/1803.09013
Citations: 0
PageRank: 0.34
References: 0
Authors: 7
Name                  Order  Citations  PageRank
José Novoa            1      10         3.92
Juan Pablo Escudero   2      1          2.04
Jorge Wuth            3      12         4.79
Víctor Poblete        4      10         2.30
Simon King            5      19         5.11
Richard M. Stern      6      1663       406.79
Néstor Becerra Yoma   7      50         18.84