Abstract

Multi-taper estimators provide low-variance power spectrum estimates that can be used in place of the windowed discrete Fourier transform (DFT) to extract speech features such as mel-frequency cepstral coefficients (MFCCs). Although past work has reported promising automatic speaker verification (ASV) results with Gaussian mixture model-based classifiers, the performance of multi-taper MFCCs with deep ASV systems remains an open question. Instead of a static-taper design, we propose to optimize the multi-taper estimator jointly with a deep neural network trained for ASV tasks. Our method helps preserve a balanced level of spectral leakage and variance, providing greater robustness, with a maximum improvement of 25.8% in equal error rate over the static taper on the SITW corpus.
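For context, the sketch below illustrates the conventional static multi-taper estimate that the paper takes as its starting point: a weighted sum of periodograms computed over Slepian (DPSS) tapers. The taper count, frame length, and FFT size are illustrative assumptions, not values from the paper; the paper's contribution is to learn the taper weighting jointly with the deep ASV network rather than fixing it in advance, which is not reproduced here.

```python
# Minimal sketch of a static multi-taper spectrum estimate: K tapered
# periodograms combined with fixed weights. In the paper these weights are
# optimized jointly with the deep ASV network; here they are uniform.
# Taper count, frame length, and FFT size are illustrative assumptions.
import numpy as np
from scipy.signal.windows import dpss

def multitaper_spectrum(frame, n_tapers=6, nfft=512, weights=None):
    """Weighted average of tapered periodograms of one speech frame."""
    n = len(frame)
    # Slepian (DPSS) tapers; one common static-taper choice
    tapers = dpss(n, NW=(n_tapers + 1) / 2, Kmax=n_tapers)  # shape (K, n)
    if weights is None:
        weights = np.full(n_tapers, 1.0 / n_tapers)  # uniform taper weights
    # Periodogram of each tapered copy of the frame
    spectra = np.abs(np.fft.rfft(tapers * frame, n=nfft, axis=1)) ** 2
    return weights @ spectra  # one-sided spectrum, shape (nfft//2 + 1,)

# Example: one 25 ms frame at 16 kHz (400 samples); random stand-in signal
frame = np.random.default_rng(0).standard_normal(400)
print(multitaper_spectrum(frame).shape)  # (257,)
```

Averaging over several orthogonal tapers lowers the variance of the spectral estimate relative to a single-window periodogram, at the cost of some extra spectral leakage; the learned weighting described in the abstract aims to balance these two effects.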
Year | DOI | Venue
---|---|---
2021 | 10.1109/LSP.2021.3122796 | IEEE Signal Processing Letters

Keywords | DocType | Volume
---|---|---
Feature extraction, Discrete Fourier transforms, Task analysis, Neural networks, Mel frequency cepstral coefficient, Stochastic processes, Standards, Multi-taper spectrum, speaker verification | Journal | 28

Issue | ISSN | Citations
---|---|---
1 | 1070-9908 | 0

PageRank | References | Authors
---|---|---
0.34 | 0 | 3
Name | Order | Citations | PageRank |
---|---|---|---
Xuechen Liu | 1 | 0 | 0.34 |
Md. Sahidullah | 2 | 326 | 24.99 |
Tomi Kinnunen | 3 | 1323 | 86.67 |