Paper Info

Title
Seq2seq Attentional Siamese Neural Networks For Text-Dependent Speaker Verification

Abstract
In this paper, we present a Sequence-to-Sequence Attentional Siamese Neural Network (Seq2Seq-ASNN) that leverages temporal alignment information for end-to-end speaker verification. Prior work on speaker-discriminative neural networks usually computes utterance-level speaker representations for evaluation and enrollment. Our proposed model, using a sequence-to-sequence (Seq2Seq) attention mechanism, maps the frame-level evaluation representation into the enrollment feature domain and then generates an utterance-level evaluation-enrollment joint vector for the final similarity measure. Feature learning, the attention mechanism, and metric learning are jointly optimized using an end-to-end loss function. Experimental results show that our proposed model outperforms various baseline methods, including the traditional i-Vector/PLDA method, multi-enrollment end-to-end speaker verification models, d-vector approaches, and a self-attention model, for text-dependent speaker verification on a Tencent internal voice wake-up dataset.
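
The alignment idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes plain dot-product attention, mean pooling, and a fixed cosine score in place of the learned feature extractor, the joint vector, and the end-to-end trained metric, and every function name here (`attend`, `mean_pool`, `verification_score`) is hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(eval_frames, enroll_frames):
    """Map each evaluation frame into the enrollment feature domain.

    Assumption: dot-product attention. Each output frame is a
    softmax-weighted sum of the enrollment frames.
    """
    dim = len(enroll_frames[0])
    aligned = []
    for query in eval_frames:
        weights = softmax([dot(query, key) for key in enroll_frames])
        aligned.append([
            sum(w * key[d] for w, key in zip(weights, enroll_frames))
            for d in range(dim)
        ])
    return aligned


def mean_pool(frames):
    """Average frame-level vectors into one utterance-level vector."""
    count = len(frames)
    return [sum(f[d] for f in frames) / count for d in range(len(frames[0]))]


def cosine(a, b):
    """Cosine similarity; assumes neither vector is all zeros."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def verification_score(eval_frames, enroll_frames):
    """Score an evaluation utterance against an enrollment utterance.

    Assumption: a fixed cosine score stands in for the learned
    similarity measure described in the abstract.
    """
    aligned = attend(eval_frames, enroll_frames)
    return cosine(mean_pool(aligned), mean_pool(enroll_frames))


# Toy frame-level features (made up for illustration only).
enroll = [[1.0, 0.0], [0.0, 1.0]]
evaluation = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
score = verification_score(evaluation, enroll)
```

Because attention produces one aligned frame per evaluation frame, the two utterances may have different lengths, which is the practical benefit of aligning at the frame level before pooling.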

Year	DOI	Venue
2019	10.1109/icassp.2019.8682676	2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords	Field	DocType
End-to-end speaker verification, text-dependent, Siamese neural networks, Seq2Seq attention	Speaker verification, Similarity measure, Pattern recognition, Computer science, Attention model, Feature extraction, Artificial intelligence, Artificial neural network, Discriminative model, Feature learning	Conference
ISSN	Citations	PageRank
1520-6149	2	0.35
References	Authors
0	6

Authors (6 rows)

Name	Order	Citations	PageRank
Yichi Zhang	1	2	0.35
Meng Yu	2	524	66.52
Na Li	3	37	23.63
Chengzhu Yu	4	16	3.77
Jia Cui	5	6	2.80
Dong Yu	6	6264	475.73
