Title
Use of VTL-wise models in feature-mapping framework to achieve performance of multiple-background models in speaker verification
Abstract
Recently, Multiple Background Models (M-BMs) [1, 2] have been shown to be useful in speaker verification, where the M-BMs are formed according to the different Vocal Tract Lengths (VTLs) in the population. Each speaker model is adapted from the particular Background Model (BM) corresponding to that speaker's VTL. During testing, the log-likelihood ratio of the test utterance is calculated between the claimant model and the corresponding BM. In this paper, instead of using a different BM for each speaker, we propose the use of a single gender-, channel- and VTL-independent UBM (root-UBM) together with VTL-dependent mapping functions. The proposed concept is inspired by the Feature Mapping (FM) technique used in speaker verification to overcome channel variability. In our proposed method, VTL-specific, gender-independent Gaussian Mixture Models (GMMs) are derived from the root-UBM using Maximum a Posteriori (MAP) adaptation. The mapping relation is then learned between the root-UBM and each VTL-specific GMM. During both the training and testing phases, feature vectors are mapped into the root-UBM space using the best-matching VTL-specific model. Speaker models are then adapted from the root-UBM using the mapped features. During testing, the log-likelihood ratio is calculated between the target model and the root-UBM. Therefore, unlike the M-BM system, there is no need to switch between different BMs depending on the claimant. Another advantage of the proposed method is that additional normalization/compensation techniques can be applied easily, since it operates within a single-UBM framework. The experiments are performed on the NIST 2004 SRE core condition, and we show that the performance of the proposed method is close to that of the M-BM system, both with and without score normalization.
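The mapping and scoring steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes diagonal-covariance GMMs stored as hypothetical `(weights, means, variances)` tuples, picks the best VTL-specific model by average log-likelihood, and applies the standard top-1-Gaussian feature-mapping transform from the channel-compensation literature. The function names are illustrative, and MAP adaptation of the models themselves is left out.

```python
import numpy as np

def component_log_probs(X, weights, means, variances):
    """Return log(w_i * N(x_t | mu_i, diag(var_i))) with shape (T, M)."""
    diff = X[:, None, :] - means[None, :, :]
    dim = means.shape[1]
    log_norm = -0.5 * (dim * np.log(2 * np.pi) + np.log(variances).sum(axis=1))
    return np.log(weights) + log_norm - 0.5 * (diff ** 2 / variances).sum(axis=2)

def avg_log_likelihood(X, gmm):
    """Average per-frame log-likelihood of X under a (weights, means, variances) GMM."""
    return np.logaddexp.reduce(component_log_probs(X, *gmm), axis=1).mean()

def map_to_root(X, vtl_gmms, root_gmm):
    """Map frames into the root-UBM space via the best-scoring VTL-specific GMM.

    Assumes each VTL-specific GMM was adapted from the root-UBM, so that
    component i of the VTL model corresponds to component i of the root-UBM.
    """
    best = max(range(len(vtl_gmms)), key=lambda k: avg_log_likelihood(X, vtl_gmms[k]))
    w, mu, var = vtl_gmms[best]
    _, mu_root, var_root = root_gmm
    # Top-scoring Gaussian of the selected VTL model for every frame.
    top = component_log_probs(X, w, mu, var).argmax(axis=1)
    # Shift and scale each frame from the VTL Gaussian to its root counterpart.
    mapped = (X - mu[top]) * np.sqrt(var_root[top] / var[top]) + mu_root[top]
    return mapped, best

def llr(X_mapped, target_gmm, root_gmm):
    """Log-likelihood ratio between the target model and the root-UBM."""
    return avg_log_likelihood(X_mapped, target_gmm) - avg_log_likelihood(X_mapped, root_gmm)
```

If only the means are MAP-adapted, the variances of corresponding components are equal and the mapping reduces to a per-frame mean shift. Because every utterance is scored against the same root-UBM after mapping, no background-model switching is needed at test time.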
Year
2011
DOI
10.1109/ICASSP.2011.5947367
Venue
Acoustics, Speech and Signal Processing
Keywords
Gaussian processes,speaker recognition,FM technique,GMM,Gaussian mixture model,M-BM,MAP,UBM,VTL,VTL-wise model,feature-mapping framework,log likelihood ratio,maximum a posteriori,multiple-background model,speaker verification,vocal tract length,FM,GMM-UBM,Multiple BM,Speaker Verification,VTL-BM
Field
Population,Feature vector,Normalization (statistics),Pattern recognition,Likelihood-ratio test,Computer science,Speech recognition,Speaker recognition,Gaussian process,Artificial intelligence,Maximum a posteriori estimation,Mixture model
DocType
Conference
ISSN
1520-6149
E-ISBN
978-1-4577-0537-3
ISBN
978-1-4577-0537-3
Citations
1
PageRank
0.35
References
4
Authors
2
Name
Order
Citations
PageRank
Achintya Kumar Sarkar1237.81
Srinivasan Umesh29316.31