Title
Comparison of discriminative training methods for speaker verification
Abstract
Maximum likelihood estimation (MLE) and Bayesian maximum a-posteriori (MAP) adaptation for Gaussian mixture models (GMM) have proven effective and efficient for speaker verification, even though each speaker model is trained using only that speaker's own training utterances. Discriminative criteria aim to increase discriminability by also exploiting out-of-class data. In this paper, we compare the performance of three discriminative training methods on the speaker verification task: the maximum mutual information (MMI), minimum classification error (MCE) and figure of merit (FOM) criteria. Experiments on the 1996 NIST speaker recognition evaluation data set show that the FOM training method outperforms the other two in terms of system performance. In addition, logistic regression is investigated and successfully employed as a discriminative score-normalization technique.
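The record does not include implementation details, so the following is only a minimal, hypothetical Python sketch of the general GMM-UBM log-likelihood-ratio scoring and logistic-regression score normalization mentioned in the abstract. The synthetic data, model sizes, and all variable names are illustrative assumptions, and the speaker GMM is trained by maximum likelihood here for brevity rather than by MAP adaptation as in the paper.

```python
# Hypothetical sketch (not the paper's implementation): GMM-UBM scoring
# followed by logistic-regression score normalization (calibration).
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy "features": 20-dimensional frames for a target speaker and for background data.
target_frames = rng.normal(loc=0.5, scale=1.0, size=(500, 20))
background_frames = rng.normal(loc=0.0, scale=1.0, size=(5000, 20))

# Universal background model (UBM) and a speaker model; the paper adapts the
# speaker model from the UBM via MAP, whereas this sketch simply trains both
# GMMs with maximum likelihood.
ubm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0).fit(background_frames)
spk = GaussianMixture(n_components=8, covariance_type="diag", random_state=0).fit(target_frames)

def llr_score(frames):
    """Average per-frame log-likelihood ratio between speaker model and UBM."""
    return float(np.mean(spk.score_samples(frames) - ubm.score_samples(frames)))

# Raw scores for a few genuine (target) and impostor trial utterances.
genuine_scores = [llr_score(rng.normal(0.5, 1.0, size=(200, 20))) for _ in range(30)]
impostor_scores = [llr_score(rng.normal(0.0, 1.0, size=(200, 20))) for _ in range(30)]

# Logistic regression as a discriminative score-normalization step:
# map raw LLR scores to posterior probabilities of a target trial.
scores = np.array(genuine_scores + impostor_scores).reshape(-1, 1)
labels = np.array([1] * len(genuine_scores) + [0] * len(impostor_scores))
calibrator = LogisticRegression().fit(scores, labels)

# Normalized (calibrated) score for a new trial utterance.
new_score = llr_score(rng.normal(0.5, 1.0, size=(200, 20)))
p_target = calibrator.predict_proba(np.array([[new_score]]))[0, 1]
print(f"raw LLR = {new_score:.3f}, calibrated P(target) = {p_target:.3f}")
```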
Year
2003
DOI
10.1109/ICASSP.2003.1198749
Venue
ICASSP (1)
Keywords
maximum mutual information,logistic regression,minimum classification error,pattern classification,nist speaker recognition evaluation data set,system performance,figure of merit,discriminability,speaker recognition,speaker model,discriminative training methods,training utterances,normalising,score normalization technique,speaker verification,mutual information,maximum a posteriori estimation,nist,maximum likelihood estimate,logistics,gaussian mixture model,maximum likelihood estimation,bayesian methods
Field
Speaker verification,Pattern recognition,Computer science,Speech recognition,NIST,Speaker recognition,Mutual information,Artificial intelligence,Discriminative model,Logistic regression,Mixture model,Bayesian probability
DocType
Conference
Volume
1
ISSN
1520-6149
ISBN
0-7803-7663-3
Citations
8
PageRank
0.68
References
6
Authors
2
Name
Order
Citations
PageRank
Chengyuan Ma111912.00
Eric Chang262549.79