Title
Computationally efficient speaker identification using fast-MLLR based anchor modeling
Abstract
In this paper, we propose a computationally efficient method to identify a speaker from a large population of speakers. The proposed method is based on our earlier Fast Maximum Likelihood Linear Regression (MLLR) anchor modeling technique [1], which provides performance comparable to the conventional anchor modeling system yet significantly reduces computation time by computing likelihoods efficiently from sufficient statistics of the data and the anchor-specific MLLR matrix. However, both of these systems still require a Gaussian Mixture Model-Universal Background Model (GMM-UBM) based back-end system to choose the optimal speaker, which is computationally heavy. We show that by applying Linear Discriminant Analysis (LDA) and Within-Class Covariance Normalization (WCCN) to the Speaker Characterization Vector (SCV) of our recently proposed Fast-MLLR method, we can combine computational efficiency with discriminative capability. The resulting system uses a simple cosine-distance measure to identify speakers, yet performs significantly better than both the full-blown GMM-UBM system and the anchor-model system. More importantly, it needs no "back-end" system. Experimental results on NIST 2004 SRE show that the proposed method reduces the identification error rate by an absolute 2% and takes only 2/3 of the time taken by the efficient Fast-MLLR system and only 20% of the time taken by the stand-alone GMM-UBM system.
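The scoring chain named in the abstract (LDA, then WCCN, then cosine distance) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Fast-MLLR extraction of SCVs is not shown, synthetic Gaussian vectors stand in for SCVs, and every function name, dimension, and parameter below is a hypothetical choice made for illustration only.

```python
import numpy as np

def class_scatter(X, y):
    """Within-class (Sw) and between-class (Sb) scatter matrices."""
    dim = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((dim, dim))
    Sb = np.zeros((dim, dim))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - mean_all)[:, None]
        Sb += len(Xc) * (d @ d.T)
    return Sw, Sb

def fit_lda(X, y, n_components, ridge=1e-6):
    """LDA projection A (dim x n_components) maximizing Sb relative to Sw."""
    Sw, Sb = class_scatter(X, y)
    Sw = Sw + ridge * np.eye(Sw.shape[0])      # small ridge for stability
    L_inv = np.linalg.inv(np.linalg.cholesky(Sw))
    M = L_inv @ Sb @ L_inv.T                   # symmetric whitened problem
    _, U = np.linalg.eigh((M + M.T) / 2)       # eigenvalues in ascending order
    top = U[:, ::-1][:, :n_components]         # keep the largest ones
    return L_inv.T @ top

def fit_wccn(Z, y, ridge=1e-6):
    """WCCN matrix B with B @ B.T equal to the inverse of the average
    within-class covariance W."""
    dim = Z.shape[1]
    classes = np.unique(y)
    W = np.zeros((dim, dim))
    for c in classes:
        Zc = Z[y == c]
        Zc = Zc - Zc.mean(axis=0)
        W += Zc.T @ Zc / len(Zc)
    W = W / len(classes) + ridge * np.eye(dim)
    return np.linalg.cholesky(np.linalg.inv(W))

def cosine_scores(test, models):
    """Cosine similarity between each test row and each model row."""
    t = test / np.linalg.norm(test, axis=1, keepdims=True)
    m = models / np.linalg.norm(models, axis=1, keepdims=True)
    return t @ m.T

def run_demo(seed=0, n_speakers=10, dim=20, n_train=8):
    """Closed-set identification on synthetic stand-ins for SCVs."""
    rng = np.random.default_rng(seed)
    centers = 5.0 * rng.standard_normal((n_speakers, dim))
    y = np.repeat(np.arange(n_speakers), n_train)
    X = centers[y] + 0.5 * rng.standard_normal((len(y), dim))
    X_test = centers + 0.5 * rng.standard_normal((n_speakers, dim))

    mu = X.mean(axis=0)                        # center on the training mean
    A = fit_lda(X - mu, y, n_components=n_speakers - 1)
    B = fit_wccn((X - mu) @ A, y)

    def project(V):
        return (V - mu) @ A @ B

    Z = project(X)
    models = np.stack([Z[y == c].mean(axis=0) for c in range(n_speakers)])
    predicted = cosine_scores(project(X_test), models).argmax(axis=1)
    return float((predicted == np.arange(n_speakers)).mean())

accuracy = run_demo()
print(f"identification accuracy on synthetic data: {accuracy:.2f}")
```

Identification here is a single matrix product followed by an argmax over speakers, which illustrates why a cosine-scored system needs no separate back-end. In this toy setup, with equal class sizes, the LDA projection already whitens the within-class scatter, so the WCCN step reduces to a uniform scaling; on real SCVs the two steps would generally differ.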
Year: 2012
DOI: 10.1109/ICASSP.2012.6288884
Venue: Acoustics, Speech and Signal Processing
Keywords: Gaussian processes, covariance matrices, maximum likelihood detection, regression analysis, speaker recognition, Gaussian mixture model-universal background model, LDA, SCV, WCCN, anchor specific MLLR matrix, back-end system, computationally efficient speaker identification, data specific MLLR matrix, fast-MLLR based anchor modeling, fast-maximum likelihood linear regression anchor modeling technique, full-blown GMM-UBM system, linear-discriminant analysis, optimal speaker, speaker characterization vector, within-class-covariance normalization, Fast MLLR, anchor model, speaker identification
Field: Population, Normalization (statistics), Pattern recognition, Anchor modeling, Computer science, Word error rate, Gaussian, NIST, Speaker recognition, Gaussian process, Artificial intelligence
DocType: Conference
ISSN: 1520-6149
E-ISBN: 978-1-4673-0044-5
ISBN: 978-1-4673-0044-5
Citations: 2
PageRank: 0.39
References: 9
Authors: 3
Name                    Order  Citations  PageRank
Achintya Kumar Sarkar   1      2          0.39
Srinivasan Umesh        2      2          0.39
Jean-François Bonastre  3      64         10.60