Title
Speech enhancement by joint statistical characterization in the Log Gabor Wavelet domain
Abstract
In speech enhancement, Bayesian marginal models cannot explain the statistical dependencies between different wavelet scales. Simple non-linear estimators for wavelet-based denoising assume that the wavelet coefficients in different scales are independent. However, wavelet coefficients have significant inter-scale dependencies. This paper introduces a new method that models the inter-scale dependency between the coefficients and their parents with a Circularly Symmetric Probability Density Function (CS-PDF) related to the family of Spherically Invariant Random Processes (SIRPs) in the Log Gabor Wavelet (LGW) domain. The corresponding joint shrinkage estimators are derived using Maximum a Posteriori (MAP) estimation theory. Two different joint shrinkage estimators are presented. In the first, the inter-scale variance of the LGW coefficients is kept constant, which gives a closed-form solution. In the second, a relatively more complex approach is presented in which the variance is not constrained to be constant. It is also shown that the proposed methods perform better when speech uncertainty is taken into consideration. The robustness of the proposed frameworks is tested on 50 speakers of the POLYCOST and YOHO speech corpora in four different noisy environments against four established speech enhancement algorithms. Experimental results show that the proposed estimators yield a higher improvement in Segmental SNR (S-SNR) and lower Log Spectral Distortion (LSD) than the other estimators. In a second evaluation, the proposed speech enhancement techniques are found to give more robust digit recognition in noisy conditions on the AURORA 2.0 speech corpus than competing methods.
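To illustrate what a closed-form joint shrinkage estimator that uses a coefficient's parent can look like, the sketch below implements the well-known bivariate shrinkage rule of Sendur and Selesnick. This is not the estimator derived in the paper, whose CS-PDF and resulting formulas are not given in the abstract. The function name, the parameter names, and the restriction to real-valued coefficients are assumptions made for illustration only.

```python
import math


def bivariate_shrink(y, y_parent, sigma_n, sigma):
    """Shrink a noisy coefficient y using its parent coefficient y_parent.

    Illustrative Sendur-Selesnick bivariate shrinkage, NOT the paper's
    estimator. Assumes real-valued coefficients, noise standard deviation
    sigma_n, and clean-signal standard deviation sigma > 0.
    """
    # Joint magnitude of the coefficient and its parent.
    r = math.hypot(y, y_parent)
    if r == 0.0:
        return 0.0
    # The threshold grows with the noise variance and shrinks as the
    # signal becomes stronger.
    threshold = math.sqrt(3.0) * sigma_n ** 2 / sigma
    # Soft-threshold the joint magnitude, then scale the child coefficient.
    gain = max(r - threshold, 0.0) / r
    return gain * y


# A small coefficient with a small parent is suppressed entirely,
# while the same coefficient with a large parent is only attenuated.
print(bivariate_shrink(0.5, 0.2, sigma_n=1.0, sigma=2.0))
print(bivariate_shrink(0.5, 4.0, sigma_n=1.0, sigma=2.0))
```

The point of the example is the dependence on the parent: a marginal estimator would treat both calls identically because the child coefficient is the same, whereas a joint rule keeps more of the coefficient when its parent is large.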
Year: 2008
DOI: 10.1016/j.specom.2008.03.004
Venue: Speech Communication
Keywords: speech corpus, speech recognition, speech enhancement, proposed estimator, circularly symmetric probability density function, proposed speech enhancement technique, wavelet coefficient, joint statistical characterization, log gabor wavelet, yoho speech corpus, proposed framework, log gabor wavelet domain, speech uncertainty, bayesian bivariate estimator, established speech enhancement algorithm, spherically invariant random processes, probability density function, signal analysis, signal to noise ratio, closed form solution, symmetric function, noise reduction, gabor wavelets, segmentation, background noise, signal processing, marginal models, stochastic process, wavelets, speech processing, robustness, algorithm
Field: Speech corpus, Speech enhancement, Speech processing, Pattern recognition, Gabor wavelet, Speech recognition, Artificial intelligence, Maximum a posteriori estimation, Estimation theory, Mathematics, Wavelet, Estimator
DocType: Journal
Volume: 50
Issue: 6
ISSN: Speech Communication
Citations: 3
PageRank: 0.38
References: 15
Authors: 3
Name                  Order  Citations  PageRank
Suman Senapati        1      4          1.41
Sandipan Chakroborty  2      31         3.34
Goutam Saha           3      255        23.17