Title
Learnable MFCCs for Speaker Verification
Abstract
We propose a learnable mel-frequency cepstral coefficients (MFCCs) front-end architecture for deep neural network (DNN) based automatic speaker verification. Our architecture retains the simplicity and interpretability of MFCC-based features while allowing the model to be adapted to data flexibly. In practice, we formulate data-driven version of four linear transforms in a standard MFCC extractor - windowing, discrete Fourier transform (DFT), mel filterbank and discrete cosine transform (DCT). Results reported reach up to 6.7% (VoxCeleb1) and 9.7% (SITW) relative improvement in term of equal error rate (EER) from static MFCCs, without additional tuning effort.
Year
DOI
Venue
2021
10.1109/ISCAS51556.2021.9401593
2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS)
Keywords
DocType
ISSN
Speaker verification, feature extraction, mel-frequency cesptral coefficients (MFCCs)
Conference
0271-4302
Citations 
PageRank 
References 
1
0.37
0
Authors
3
Name
Order
Citations
PageRank
Xuechen Liu110.71
Md. Sahidullah232624.99
Tomi Kinnunen3132386.67