Multi-User VoiceFilter-Lite via Attentive Speaker Embedding

Paper Info

Title
Multi-User VoiceFilter-Lite via Attentive Speaker Embedding

Abstract
In this paper, we propose a solution to allow speaker conditioned speech models, such as VoiceFilter-Lite, to support an arbitrary number of enrolled users in a single pass. This is achieved via an attention mechanism on multiple speaker embeddings to compute a single attentive embedding, which is then used as a side input to the model. We implemented multi-user VoiceFilter-Lite and evaluated it for three tasks: (1) a streaming automatic speech recognition (ASR) task; (2) a text-independent speaker verification task; and (3) a personalized keyphrase detection task, where ASR has to detect keyphrases from multiple enrolled users in a noisy environment. Our experiments show that, with up to four enrolled users, multi-user VoiceFilter-Lite is able to significantly reduce speech recognition and speaker verification errors when there is overlapping speech, without affecting performance under other acoustic conditions. This attentive speaker embedding approach can also be easily applied to other speaker-conditioned models such as personal voice activity detection (VAD) and personalized ASR.
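The core idea in the abstract, collapsing several enrolled speaker embeddings into one attentive embedding that serves as the model's side input, can be illustrated with a minimal sketch. It assumes plain scaled dot-product attention with a softmax; the paper's actual scoring network, embedding dimensions, and training setup are not given on this page, so `attentive_embedding`, the `query` vector, and the toy 3-dimensional embeddings are all hypothetical stand-ins.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_embedding(query, enrolled):
    """Combine any number of enrolled speaker embeddings into one vector.

    Assumption: scores are scaled dot products between a query vector
    (hypothetically derived from the current audio frame) and each
    enrolled embedding. The paper may use a learned scoring function.
    """
    if not enrolled:
        raise ValueError("at least one enrolled embedding is required")
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, emb)) / scale for emb in enrolled
    ]
    weights = softmax(scores)
    dim = len(enrolled[0])
    # Weighted sum of the enrolled embeddings: the single side input.
    combined = [
        sum(w * emb[i] for w, emb in zip(weights, enrolled))
        for i in range(dim)
    ]
    return combined, weights


# Toy example: three enrolled users, query closest to the first one.
query = [1.0, 0.0, 0.0]
enrolled = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
embedding, weights = attentive_embedding(query, enrolled)
```

Because the output has the same dimensionality as a single speaker embedding regardless of how many users are enrolled, a model conditioned on one embedding can in principle consume it unchanged, which is what lets the approach run in a single pass.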

Year	2021
DOI	10.1109/ASRU51503.2021.9687870
Venue	2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Keywords	VoiceFilter-Lite, speaker embedding, attention mechanism, keyphrase detection
DocType	Conference
ISBN	978-1-6654-3740-0
Citations	0
PageRank	0.34
References	0
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Rajeev Rikhye	1	0	0.34
Quan Wang	2	115	20.15
Qiao Liang	3	77	19.86
Yanzhang He	4	64	16.36
Ian McGraw	5	253	24.41
