Title
Speech Enhancement Using Non-Negative Spectrogram Models With Mel-Generalized Cepstral Regularization
Abstract
Spectral-domain speech enhancement algorithms based on non-negative spectrogram models, such as non-negative matrix factorization (NMF) and non-negative matrix factor deconvolution, achieve high signal recovery accuracy. However, they do not directly lead to an enhancement in the feature domain (e.g., the cepstral domain) or in perceived quality. We have previously proposed a method that makes it possible to enhance speech in the spectral and cepstral domains simultaneously. Although this method was shown to be effective, the devised algorithm was computationally demanding. This paper proposes an alternative formulation that allows for a fast implementation by replacing the regularization term with a divergence measure between the NMF model and the mel-generalized cepstral (MGC) representation of the target spectrum. Since the MGC is an auditory-motivated representation of an audio signal that is widely used in parametric speech synthesis, we also expect the proposed method to improve perceived quality. Experimental results demonstrated the effectiveness of the proposed method in terms of both the signal-to-distortion ratio and the cepstral distance.
Year: 2017
DOI: 10.21437/Interspeech.2017-1492
Venue: 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION
Keywords: speech enhancement, non-negative matrix factorization, mel-generalized cepstral representation, single channel signal processing
Field: Speech enhancement, Pattern recognition, Computer science, Spectrogram, Cepstrum, Speech recognition, Regularization (mathematics), Artificial intelligence
DocType: Conference
ISSN: 2308-457X
Citations: 0
PageRank: 0.34
References: 2
Authors: 4
Name
Order
Citations
PageRank
Li Li1581109.68
Hirokazu Kameoka280179.06
Tomoki Toda31874167.18
S. Makino41736189.21