Title
Performance comparison of multitaper techniques for speaker verification with expressive speech.
Abstract
In this paper, we present a comparative study of spectral front-end features, derived from multitaper magnitude and phase spectra, as representations of speech signals for speaker verification with expressive speech. In particular, the multitaper modified group delay function (MT-MOGDF) and multitaper magnitude (MT-MAG) spectra of the speech signals are employed to obtain low-variance estimates of speech spectra. We observe that the cues that aid in representing expressive speech are more evident in the MT-MOGDF spectrum than in the MT-MAG spectrum, in terms of mean formant value and formant bandwidth. Our extensive experimental study on a speaker verification system with a Gaussian mixture model-based universal background model classifier on expressive speech, using the IITKGP-SESC and EMODB databases, shows that MT-MOGDF performs better than the MT-MAG technique in terms of equal error rate and minimum decision cost function. This improvement is attributed to a better representation and a low-variance estimate of the speech spectrum. Our results highlight the utility of MT-MOGDF as a potential alternative to the MT-MAG representation for speaker verification problems in general.
Year: 2018
DOI: 10.1007/s10772-017-9479-0
Venue: I. J. Speech Technology
Keywords: Multitaper modified group delay function, Multitaper magnitude spectrum, Speaker verification, Expressive speech, IITKGP-SESC database
Field: Speaker verification, Pattern recognition, Multitaper, Computer science, Word error rate, Group delay and phase delay, Speech recognition, Bandwidth (signal processing), Artificial intelligence, Formant, Classifier (linguistics), Mixture model
DocType: Journal
Volume: 21
Issue: 3
ISSN: 1381-2416
Citations: 1
PageRank: 0.35
References: 4
Authors: 3
Name              Order  Citations  PageRank
Narendra K. C.    1      1          0.35
R. Kumaraswamy    2      4          1.91
Gurugopinath, S.  3      10         4.65