Learnable MFCCs for Speaker Verification - Citegraph

Paper Info

Title
Learnable MFCCs for Speaker Verification

Abstract
We propose a learnable mel-frequency cepstral coefficients (MFCCs) front-end architecture for deep neural network (DNN) based automatic speaker verification. Our architecture retains the simplicity and interpretability of MFCC-based features while allowing the model to be adapted to data flexibly. In practice, we formulate data-driven versions of the four linear transforms in a standard MFCC extractor: windowing, discrete Fourier transform (DFT), mel filterbank and discrete cosine transform (DCT). The reported results reach up to 6.7% (VoxCeleb1) and 9.7% (SITW) relative improvement in terms of equal error rate (EER) over static MFCCs, without additional tuning effort.
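The four linear transforms named in the abstract can be sketched as explicit matrices and vectors. The NumPy example below is a minimal illustration of a standard (static) MFCC extractor for a single frame, not the authors' implementation: all function names, the triangular filter construction, and the parameter values (16 kHz sampling rate, 512-point DFT, 40 filters, 20 coefficients) are assumptions made for illustration. In a learnable variant, each of `window`, `dft`, `fb` and `dct` would presumably be initialised this way and then updated by gradient descent; this page does not give the paper's exact parameterisation.

```python
import numpy as np

def hz_to_mel(f):
    # A common mel-scale formula (assumed here; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def dft_matrix(n_fft):
    # One-sided DFT as a complex matrix of shape (n_fft // 2 + 1, n_fft).
    k = np.arange(n_fft // 2 + 1)[:, None]
    n = np.arange(n_fft)[None, :]
    return np.exp(-2j * np.pi * k * n / n_fft)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters with centres equally spaced on the mel scale.
    n_bins = n_fft // 2 + 1
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    mel_edges = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    edges = mel_to_hz(mel_edges)
    fb = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def dct_matrix(n_ceps, n_filters):
    # Unnormalised DCT-II basis of shape (n_ceps, n_filters).
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_filters)

def mfcc_frame(frame, window, dft, fb, dct, eps=1e-10):
    # 1) windowing, 2) DFT, 3) mel filterbank, 4) DCT.
    # The power spectrum and the logarithm are the non-linear steps in between.
    windowed = window * frame
    power = np.abs(dft @ windowed) ** 2
    log_mel = np.log(fb @ power + eps)
    return dct @ log_mel

# Illustrative settings (assumptions, not taken from the paper).
sample_rate, n_fft, n_filters, n_ceps = 16000, 512, 40, 20
window = np.hamming(n_fft)
dft = dft_matrix(n_fft)
fb = mel_filterbank(n_filters, n_fft, sample_rate)
dct = dct_matrix(n_ceps, n_filters)

frame = np.random.default_rng(0).standard_normal(n_fft)
coeffs = mfcc_frame(frame, window, dft, fb, dct)
print(coeffs.shape)
```

Writing each stage as a plain array makes the appeal of a learnable front-end concrete: the static extractor is one particular setting of four parameter tensors, so it can serve as an interpretable initialisation.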

Year	DOI	Venue
2021	10.1109/ISCAS51556.2021.9401593	2021 IEEE International Symposium on Circuits and Systems (ISCAS)
Keywords	DocType	ISSN
Speaker verification, feature extraction, mel-frequency cepstral coefficients (MFCCs)	Conference	0271-4302
Citations	PageRank	References
1	0.37	0
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Xuechen Liu	1	1	0.71
Md. Sahidullah	2	326	24.99
Tomi Kinnunen	3	1323	86.67
