Title
Investigation of Speaker-Clustered UBMs based on Vocal Tract Lengths and MLLR matrices for Speaker Verification.
Abstract
It is common to use a single, large, speaker-independent Gaussian Mixture Model based Universal Background Model (GMM-UBM) as the alternative hypothesis for speaker verification tasks. The speaker models are themselves derived from the UBM using the Maximum a Posteriori (MAP) adaptation technique. During verification, the log-likelihood ratio between the target model and the GMM-UBM is calculated to accept or reject the claimant. The use of a single UBM for different population groups may not be appropriate, especially when the impostors are close to the target speaker. In this paper, we investigate the use of a Speaker Cluster-wise UBM (SC-UBM) for a group of target speakers, based on two different similarity measures. In the first approach, speakers are grouped into clusters depending on their Vocal Tract Lengths (VTLs). Speakers sharing the same VTL parameter have similar vocal-tract geometry, which is a speaker-dependent characteristic. In the second approach, we use the Maximum Likelihood Linear Regression (MLLR) matrices of the target speakers to create MLLR super-vectors and use them to cluster speakers into groups. The SC-UBMs are derived from the GMM-UBM by MLLR adaptation, using data from the corresponding group of target speakers. Finally, speaker-dependent models are adapted from their respective SC-UBMs using MAP. In the proposed method, the log-likelihood ratio is calculated between the target model and its corresponding SC-UBM. We compare the performance of this method with the single-UBM method for a varying number of clusters. The experiments are performed on the NIST 2004 SRE core condition, and we show that the proposed method, with a slight increase in the number of UBMs, always outperforms the conventional single GMM-UBM system.
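The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`gmm_avg_loglik`, `sc_ubm_llr`), the tuple layout `(weights, means, variances)`, the diagonal-covariance assumption, and the per-frame averaging of the log-likelihood are all assumptions made for illustration. The clustering (by VTL or MLLR super-vector) and the MLLR/MAP adaptation steps are taken as already done.

```python
import numpy as np

def gmm_avg_loglik(X, weights, means, variances):
    """Average per-frame log-likelihood of frames X (T x D) under a
    diagonal-covariance GMM with M components (assumed model form)."""
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)      # shape (M,)
    means = np.asarray(means, dtype=float)          # shape (M, D)
    variances = np.asarray(variances, dtype=float)  # shape (M, D)

    diff = X[:, None, :] - means[None, :, :]        # shape (T, M, D)
    # Log of the Gaussian normalisation constant for each component.
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)
    log_comp = log_norm[None, :] - 0.5 * (diff ** 2 / variances[None, :, :]).sum(axis=2)
    log_weighted = log_comp + np.log(weights)[None, :]

    # Numerically stable log-sum-exp over the mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(frame_ll.mean())

def sc_ubm_llr(X, target_gmm, sc_ubms, target_cluster):
    """Log-likelihood ratio between a target model and the SC-UBM of the
    cluster the target speaker was assigned to (hypothetical data layout:
    sc_ubms maps a cluster id to a (weights, means, variances) tuple)."""
    ubm = sc_ubms[target_cluster]
    return gmm_avg_loglik(X, *target_gmm) - gmm_avg_loglik(X, *ubm)
```

A claimant would then be accepted when `sc_ubm_llr(...)` exceeds a decision threshold; the conventional single-UBM system corresponds to the special case where every target speaker maps to the same background model.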
Year
2010
Venue
ODYSSEY 2010: THE SPEAKER AND LANGUAGE RECOGNITION WORKSHOP
Field
Speaker verification, Matrix (mathematics), Computer science, Speech recognition, Vocal tract
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
2
Name                     Order  Citations  PageRank
Achintya Kumar Sarkar    1      23         7.81
Srinivasan Umesh         2      93         16.31