Title | ||
---|---|---|
Computationally efficient speaker identification using fast-MLLR based anchor modeling |
Abstract | ||
---|---|---|
In this paper, we propose a computationally efficient method to identify a speaker from a large population of speakers. The proposed method is based on our earlier [1] Fast Maximum Likelihood Linear Regression (MLLR) anchor modeling technique, which provides performance comparable to the conventional anchor modeling system yet reduces computation time significantly by computing the likelihood efficiently from sufficient statistics of the data and an anchor-specific MLLR matrix. However, both of these systems still require a Gaussian Mixture Model-Universal Background Model (GMM-UBM) based back-end system to choose the optimal speaker, which is computationally heavy. In our proposed method, we show that by applying Linear Discriminant Analysis (LDA) and Within-Class Covariance Normalization (WCCN) to the Speaker Characterization Vector (SCV) of our recently proposed Fast-MLLR method, we can combine computational efficiency with discriminative capability, giving a system that uses a simple cosine-distance measure to identify speakers and yet performs significantly better than both the full-blown GMM-UBM system and the anchor-model system. More importantly, there is no need for the “back-end” system. Experimental results on NIST 2004 SRE show that the proposed method reduces the identification error rate by an absolute 2% and takes only 2/3 of the time taken by the efficient Fast-MLLR system and only 20% of the time taken by the stand-alone GMM-UBM system. |
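
The scoring stage described in the abstract (LDA, then WCCN, then cosine-distance identification on SCVs) might be sketched roughly as follows. This is an illustrative NumPy sketch, not the authors' implementation: the abstract does not give dimensions, regularization, or training details, so the function names `fit_lda`, `fit_wccn`, and `identify`, the `1e-6` regularizer, and the use of one averaged vector per enrolled speaker are all assumptions. Extraction of the SCVs themselves (the Fast-MLLR step) is not shown.

```python
import numpy as np

def fit_lda(X, y, n_components):
    """Return a (d, n_components) LDA projection for row-vector data X.

    Solves Sb v = lambda Sw v by whitening with the Cholesky factor of Sw.
    """
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    Sw += 1e-6 * np.eye(d)  # assumed small regularizer for stability
    L_inv = np.linalg.inv(np.linalg.cholesky(Sw))  # Sw = L L^T
    evals, U = np.linalg.eigh(L_inv @ Sb @ L_inv.T)  # ascending eigenvalues
    order = np.argsort(evals)[::-1][:n_components]
    return L_inv.T @ U[:, order]

def fit_wccn(Z, y):
    """Return B with B B^T = W^{-1}, W being the mean within-class covariance."""
    d = Z.shape[1]
    classes = np.unique(y)
    W = np.zeros((d, d))
    for c in classes:
        Zc = Z[y == c]
        Zc = Zc - Zc.mean(axis=0)
        W += Zc.T @ Zc / len(Zc)
    W = W / len(classes) + 1e-6 * np.eye(d)
    return np.linalg.cholesky(np.linalg.inv(W))

def identify(test_vec, enrolled, A, B):
    """Pick the enrolled speaker whose projected vector has the highest cosine score."""
    P = A @ B  # combined LDA + WCCN projection for row vectors
    t = test_vec @ P
    E = enrolled @ P
    scores = (E @ t) / (np.linalg.norm(E, axis=1) * np.linalg.norm(t))
    return int(np.argmax(scores)), scores
```

In this sketch, identification needs only one matrix-vector product per test utterance plus one dot product per enrolled speaker, which is consistent with the abstract's point that no GMM-UBM back-end is needed to pick the speaker.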
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/ICASSP.2012.6288884 | Acoustics, Speech and Signal Processing |
Keywords | Field | DocType |
Gaussian processes,covariance matrices,maximum likelihood detection,regression analysis,speaker recognition,Gaussian mixture model-universal background model,LDA,SCV,WCCN,anchor specific MLLR matrix,back-end system,computationally efficient speaker identification,data specific MLLR matrix,fast-MLLR based anchor modeling,fast-maximum likelihood linear regression anchor modeling technique,full-blown GMM-UBM system,linear-discriminant analysis,optimal speaker,speaker characterization vector,within-class-covariance normalization,Fast MLLR,LDA,WCCN,anchor model,speaker identification | Population,Normalization (statistics),Pattern recognition,Anchor modeling,Computer science,Word error rate,Gaussian,NIST,Speaker recognition,Gaussian process,Artificial intelligence | Conference |
ISSN | ISBN | Citations |
1520-6149 | 978-1-4673-0044-5 | 2 |
PageRank | References | Authors |
0.39 | 9 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Achintya Kumar Sarkar | 1 | 2 | 0.39 |
Srinivasan Umesh | 2 | 2 | 0.39 |
Jean-François Bonastre | 3 | 64 | 10.60 |