Title
Thorough evaluation of TIMIT database speaker identification performance under noise with and without the G.712 type handset.
Abstract
This work proposes a speaker identification system that employs two feature extraction models: power normalized cepstral coefficients (PNCC) and mel frequency cepstral coefficients (MFCC). Both feature sets are modeled acoustically with a Gaussian mixture model–universal background model. The purpose of this work is to provide a thorough evaluation of the effect of different types of noise on the speaker identification accuracy (SIA), and thereby to provide benchmark figures for future comparative studies. In particular, additive white Gaussian noise and eight non-stationary noise types (with and without the G.712 type handset) are tested at various signal-to-noise ratios. Three late fusion methods are also employed: maximum, weighted sum, and mean fusion. Measurements from 120 randomly selected speakers from the TIMIT database are used, and system performance is measured by the SIA. Weighted sum fusion gave the best SIA on noisy speech. The proposed model and its related analysis pave the way for further work in this important area.
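The three late fusion rules named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each feature stream (PNCC and MFCC) has already produced one score per enrolled speaker, that the two score lists are normalized to a comparable range, and that the SIA is the percentage of test utterances attributed to the correct speaker. The function names, the `weight` parameter, and the example scores are hypothetical.

```python
from typing import List, Sequence


def fuse_scores(a: Sequence[float], b: Sequence[float],
                method: str = "mean", weight: float = 0.5) -> List[float]:
    """Combine two per-speaker score lists with a late fusion rule.

    `weight` (hypothetical parameter) is the share given to `a` in the
    weighted sum; `b` receives 1 - weight.
    """
    if len(a) != len(b):
        raise ValueError("score lists must cover the same speakers")
    if method == "max":
        return [max(x, y) for x, y in zip(a, b)]
    if method == "mean":
        return [(x + y) / 2 for x, y in zip(a, b)]
    if method == "weighted_sum":
        return [weight * x + (1 - weight) * y for x, y in zip(a, b)]
    raise ValueError(f"unknown fusion method: {method}")


def identify(scores: Sequence[float]) -> int:
    """Return the index of the speaker with the highest fused score."""
    return max(range(len(scores)), key=scores.__getitem__)


def identification_accuracy(predicted: Sequence[int],
                            actual: Sequence[int]) -> float:
    """Percentage of utterances whose predicted speaker matches the truth."""
    correct = sum(p == t for p, t in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Made-up scores for three enrolled speakers from the two feature streams.
pncc_scores = [0.2, 0.7, 0.1]
mfcc_scores = [0.5, 0.3, 0.2]
for rule in ("max", "mean", "weighted_sum"):
    fused = fuse_scores(pncc_scores, mfcc_scores, method=rule, weight=0.25)
    print(rule, identify(fused))
```

With these made-up scores, the maximum and mean rules pick speaker 1, while a weighted sum that favors the MFCC stream picks speaker 0, which shows how the choice of rule and weight can change the decision.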
Year: 2019
DOI: 10.1007/s10772-019-09630-9
Venue: International Journal of Speech Technology
Keywords: Speaker identification, TIMIT-database, Stationary and non-stationary background noise, G.712 type handset
Field: Mel-frequency cepstrum, Speaker identification, Normalization (statistics), Pattern recognition, Computer science, Signal-to-noise ratio, Speech recognition, Feature extraction, Gaussian, Artificial intelligence, Handset, Additive white Gaussian noise
DocType: Journal
Volume: 22
Issue: 3
ISSN: 1381-2416
Citations: 0
PageRank: 0.34
References: 0
Authors
4