Abstract |
---|
Probabilistic linear discriminant analysis (PLDA) is the de facto standard for backends in i-vector speaker recognition. If we extend the PLDA paradigm using non-linear models, e.g., deep neural networks, the posterior distributions of the latent variables and the marginal likelihood become intractable. In this paper, we propose to approach this problem using stochastic gradient variational Bayes. We generalize the PLDA model to let i-vectors depend non-linearly on the latent factors. We approximate the evidence lower bound (ELBO) by Monte Carlo sampling using the reparametrization trick. This enables us to optimize the ELBO using backpropagation to jointly estimate the parameters that define the model and the approximate posteriors of the latent factors. We also present a reformulation of the likelihood ratio, which we call Q-scoring. Q-scoring makes it possible to efficiently score the speaker verification trials for this model. Experimental results on NIST SRE10 suggest that more data might be required to exploit the potential of this method. |
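The core training idea in the abstract — estimating the ELBO by Monte Carlo sampling with the reparametrization trick so it can be optimized by backpropagation — can be illustrated with a minimal sketch. This is not the paper's model: the linear-Gaussian decoder, the dimensions, and all function names below are illustrative assumptions; only the reparametrized sampling `z = mu + sigma * eps` and the MC ELBO estimate reflect the technique described.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy decoder (not the paper's network): x = W z + noise,
# with a standard-normal prior on the latent factor z.
d_z, d_x = 2, 5
W = rng.standard_normal((d_x, d_z))

def log_lik(x, z, noise_var=0.1):
    """log p(x | z) under the assumed linear-Gaussian decoder."""
    diff = x - W @ z
    return -0.5 * (diff @ diff / noise_var
                   + d_x * np.log(2 * np.pi * noise_var))

def elbo_mc(x, mu, log_sigma, n_samples=64):
    """Monte Carlo ELBO estimate with the reparametrization trick:
    z = mu + sigma * eps, eps ~ N(0, I), so the sample is a smooth
    function of (mu, sigma) and gradients can flow through it."""
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal((n_samples, d_z))
    z = mu + sigma * eps                      # reparametrized samples
    expected_ll = np.mean([log_lik(x, zi) for zi in z])
    # KL(q(z) || N(0, I)) has a closed form for diagonal Gaussians.
    kl = 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - 2.0 * log_sigma)
    return expected_ll - kl

x = W @ np.array([1.0, -1.0]) + 0.1 * rng.standard_normal(d_x)
print(elbo_mc(x, mu=np.zeros(d_z), log_sigma=np.zeros(d_z)))
```

In an autodiff framework the same estimator would be differentiated with respect to `mu`, `log_sigma`, and the decoder parameters jointly, which is what allows model parameters and approximate posteriors to be trained together by backpropagation.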
Year | DOI | Venue
---|---|---
2017 | 10.21437/Interspeech.2017-1018 | 18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017)
Keywords | Field | DocType
---|---|---
speaker recognition, i-vectors, variational autoencoders, stochastic variational inference, PLDA | i-vector, Autoencoder, Pattern recognition, Computer science, Speech recognition, Speaker recognition, Artificial intelligence | Conference
ISSN | Citations | PageRank
---|---|---
2308-457X | 1 | 0.35
References | Authors
---|---
5 | 3
Name | Order | Citations | PageRank
---|---|---|---
Jesús Villalba | 1 | 41 | 5.11
Niko Brümmer | 2 | 595 | 44.01
N. Dehak | 3 | 1269 | 92.64