Title | ||
---|---|---|
Locally-Connected And Convolutional Neural Networks For Small Footprint Speaker Recognition |
Abstract | ||
---|---|---|
This work compares the performance of deep Locally Connected Networks (LCN) and Convolutional Neural Networks (CNN) for text-dependent speaker recognition. These topologies model the local time-frequency correlations of the speech signal better, using only a fraction of the number of parameters of a fully connected Deep Neural Network (DNN) used in previous works. We show that both an LCN and a CNN can reduce the total model footprint to 30% of the size of a baseline fully connected DNN, with minimal impact on performance or latency. In addition, when matching parameters, the LCN improves speaker verification performance, as measured by equal error rate (EER), by 8% relative over the baseline without increasing model size or computation. Similarly, a CNN improves EER by 10% relative over the baseline for the same model size but with increased computation. |
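
The footprint and computation trade-off described in the abstract can be illustrated by counting weights and multiply-accumulate operations for a single layer of each type. The sketch below is only illustrative: the 40 x 40 input patch, the 10 x 10 kernel, the 16 filters, and the 256 hidden units are hypothetical values chosen for the arithmetic, not the configuration used in the paper, and the helper names are made up for this example.

```python
# Illustrative parameter/compute counts for one layer of each topology.
# All layer sizes used below are hypothetical, not taken from the paper.

def out_size(n, k, stride):
    """Number of kernel positions along one axis (no padding)."""
    return (n - k) // stride + 1

def fc_params(in_h, in_w, hidden):
    """Fully connected: every input value feeds every hidden unit, plus biases."""
    return in_h * in_w * hidden + hidden

def conv_params(k_h, k_w, in_ch, filters):
    """Convolutional: one kernel per filter, shared across all positions."""
    return k_h * k_w * in_ch * filters + filters

def lcn_params(in_h, in_w, k_h, k_w, in_ch, filters, stride):
    """Locally connected: same local patches as a convolution,
    but each position has its own unshared kernel and bias."""
    positions = out_size(in_h, k_h, stride) * out_size(in_w, k_w, stride)
    return positions * (k_h * k_w * in_ch * filters + filters)

def conv_macs(in_h, in_w, k_h, k_w, in_ch, filters, stride):
    """Multiply-accumulates for a convolution: the shared kernel is
    re-applied at every position, so compute grows with position count."""
    positions = out_size(in_h, k_h, stride) * out_size(in_w, k_w, stride)
    return positions * k_h * k_w * in_ch * filters

if __name__ == "__main__":
    print("FC params :", fc_params(40, 40, 256))
    print("LCN params:", lcn_params(40, 40, 10, 10, 1, 16, stride=10))
    print("CNN params:", conv_params(10, 10, 1, 16))
    print("CNN MACs  :", conv_macs(40, 40, 10, 10, 1, 16, stride=1))
```

In a locally connected layer each weight is used once per input, so its multiply-accumulate count equals its weight count. A convolutional layer reuses its weights at every position, which is why a CNN scaled up to the same number of parameters as the baseline performs more computation.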
Year | Venue | Field |
---|---|---|
2015 | 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | Pattern recognition, Convolutional neural network, Computer science, Speech recognition, Speaker recognition, Footprint, Artificial intelligence, Deep learning |
DocType | Citations | PageRank |
---|---|---|
Conference | 6 | 0.49 |
References | Authors |
---|---|
6 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yu-hsin Chen | 1 | 6 | 0.49 |
Ignacio Lopez-Moreno | 2 | 187 | 14.97 |
Tara N. Sainath | 3 | 3497 | 232.43 |
Mirkó Visontai | 4 | 321 | 23.62 |
Raziel Álvarez | 5 | 30 | 3.84 |
Carolina Parada | 6 | 242 | 13.11 |