Title
---
Direct Optimization of the Detection Cost for I-Vector-Based Spoken Language Recognition
Abstract
---
We explore a method to boost the discriminative capabilities of the probabilistic linear discriminant analysis (PLDA) model without losing its generative advantages. We present sequential projection and training steps leading to a classifier that operates in the original i-vector space but is discriminatively trained in a low-dimensional PLDA latent subspace. We use the extended Baum-Welch technique to optimize the model with respect to two objective functions for discriminative training. One is the well-known maximum mutual information objective, while the other is a new objective that we propose to approximate the language detection cost. We evaluate performance on the NIST language recognition evaluation (LRE 2015) and on our development dataset comprising utterances from previous LREs. We improve the detection cost by 10% and 6% relative to our fine-tuned generative and discriminative baselines, respectively, and by 10% over the best of our previously reported results. The proposed cost-function approximation and PLDA subspace training are applicable to a broad range of tasks.
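The detection cost the abstract refers to is, in NIST LRE-style closed-set language detection, conventionally an average of per-language miss and false-alarm costs (C_avg). As a rough illustration of the metric being optimized, here is a minimal sketch of such an average detection cost, assuming the standard form with a uniform target prior and a hypothetical hard decision threshold of 0 on the scores (the paper itself works with a smooth approximation suitable for extended Baum-Welch training, which is not reproduced here):

```python
import numpy as np

def average_detection_cost(scores, labels, n_languages,
                           p_target=0.5, c_miss=1.0, c_fa=1.0):
    """Average closed-set detection cost (C_avg-style).

    scores[i, j]  : detection score of trial i for language j
    labels[i]     : index of the true language of trial i
    A trial is "accepted" for a language when its score exceeds 0
    (hypothetical threshold chosen for this sketch).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    decisions = scores > 0.0  # accept/reject per (trial, language)
    cost = 0.0
    for j in range(n_languages):
        target = labels == j
        # Miss rate: target trials rejected for their own language.
        p_miss = np.mean(~decisions[target, j]) if target.any() else 0.0
        cost_j = c_miss * p_target * p_miss
        # False-alarm rate against each of the other languages.
        for k in range(n_languages):
            if k == j:
                continue
            nontarget = labels == k
            p_fa = np.mean(decisions[nontarget, j]) if nontarget.any() else 0.0
            cost_j += c_fa * (1.0 - p_target) / (n_languages - 1) * p_fa
        cost += cost_j
    return cost / n_languages
```

With perfectly separated scores the cost is 0; each miss or false alarm adds its prior-weighted penalty. Because the indicator decisions make this cost piecewise constant in the model parameters, a differentiable surrogate (as proposed in the paper) is needed for gradient- or EBW-based training.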
Year | DOI | Venue |
---|---|---|
2017 | 10.1109/TASLP.2017.2651377 | IEEE/ACM Trans. Audio, Speech & Language Processing |
Keywords | Field | DocType |
Training,Speech,Speech recognition,Nickel,Optimization,Covariance matrices,Probabilistic logic | Subspace topology,Pattern recognition,Computer science,Speech recognition,NIST,Language identification,Mutual information,Artificial intelligence,Generative grammar,Classifier (linguistics),Discriminative model,Spoken language | Journal |
Volume | Issue | ISSN |
25 | 3 | 2329-9290 |
Citations | PageRank | References |
5 | 0.45 | 24 |
Authors
---
3
Name | Order | Citations | PageRank |
---|---|---|---|
Aleksandr Sizov | 1 | 96 | 4.54 |
Kong-Aik Lee | 2 | 709 | 60.64 |
Tomi Kinnunen | 3 | 1323 | 86.67 |