Abstract | ||
---|---|---|
In this study, we focus on nonlinear compression methods in spectral features for speaker verification based on deep neural network. We consider different kinds of channel-dependent (CD) nonlinear compression methods optimized in a data-driven manner. Our methods are based on power nonlinearities and dynamic range compression (DRC). We also propose multi-regime (MR) design on the nonlinearities, at improving robustness. Results on VoxCeleb1 and VoxMovies data demonstrate improvements brought by proposed compression methods over both the commonly-used logarithm and their static counterparts, especially for ones based on power function. While CD generalization improves performance on VoxCeleb1, MR provides more robustness on VoxMovies, with a maximum relative equal error rate reduction of 21.6%. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/ICASSP43922.2022.9747185 | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) |
DocType | Citations | PageRank |
Conference | 0 | 0.34 |
References | Authors | |
0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xuechen Liu | 1 | 0 | 0.34 |
Md. Sahidullah | 2 | 326 | 24.99 |
Tomi Kinnunen | 3 | 1323 | 86.67 |