Title |
---|
Deep learning with maximal figure-of-merit cost to advance multi-label speech attribute detection |
Abstract |
---|
In this work, we aim to improve speech attribute detection by formulating it as a multi-label classification task, with deep neural networks (DNNs) serving as the speech attribute detectors. A straightforward way to tackle the task is to estimate the DNN parameters using the mean squared error (MSE) loss function and to employ a sigmoid function in the DNN output nodes. A more principled way is to incorporate the micro-F1 measure, a widely used metric in multi-label classification, into the DNN loss function so that the metric of interest is optimized directly at training time. Micro-F1 is not differentiable, but we overcome this problem by casting the task in the maximal figure-of-merit (MFoM) learning framework. The results demonstrate that our MFoM approach consistently outperforms the baseline systems. |
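
To illustrate why micro-F1 cannot be used directly as a loss, the following minimal sketch computes micro-F1 from hard 0/1 decisions and then a smooth surrogate in which each hard decision is replaced by a sigmoid of the score. This is not the authors' MFoM formulation, which the abstract does not spell out; the function names, the `threshold` of 0.5, and the slope `alpha` are assumptions chosen for illustration only.

```python
import math


def micro_f1(y_true, y_pred):
    """Micro-F1 pooled over all (sample, label) pairs, from hard 0/1 decisions."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            tp += t * p              # label present and detected
            fp += (1 - t) * p        # label absent but detected
            fn += t * (1 - p)        # label present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_micro_f1(y_true, scores, threshold=0.5, alpha=10.0):
    """Smooth surrogate of micro-F1 (illustrative assumption, not MFoM itself).

    Each hard decision is replaced by sigmoid(alpha * (score - threshold)),
    so the pooled counts, and hence the ratio, vary smoothly with the scores.
    """
    tp = fp = fn = 0.0
    for t_row, s_row in zip(y_true, scores):
        for t, s in zip(t_row, s_row):
            d = sigmoid(alpha * (s - threshold))  # soft decision in (0, 1)
            tp += t * d
            fp += (1 - t) * d
            fn += t * (1 - d)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 2 utterances, 3 attribute labels each.
y_true = [[1, 0, 1], [0, 1, 0]]
scores = [[0.9, 0.2, 0.4], [0.1, 0.8, 0.7]]
y_pred = [[1 if s > 0.5 else 0 for s in row] for row in scores]

hard = micro_f1(y_true, y_pred)
soft = soft_micro_f1(y_true, scores)
```

With a large `alpha` the soft decisions approach the hard ones, so the surrogate approaches the true micro-F1. A smaller `alpha` gives a smoother objective whose negative could, under these assumptions, serve as a training loss.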
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/SLT.2016.7846308 | 2016 IEEE Spoken Language Technology Workshop (SLT) |
Keywords | Field | DocType |
Speech articulatory attributes detection, deep neural networks, convolutional neural networks, maximal figure-of-merit, foreign accent recognition | Computer science, Mean squared error, Figure of merit, Differentiable function, Artificial intelligence, Deep learning, Artificial neural network, Sigmoid function, Pattern recognition, Speech recognition, Boosting (machine learning), Hidden Markov model, Machine learning | Conference |
ISSN | ISBN | Citations |
2639-5479 | 978-1-5090-4904-2 | 0 |
PageRank | References | Authors |
0.34 | 9 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ivan Kukanov | 1 | 3 | 2.40 |
Ville Hautamäki | 2 | 385 | 33.51 |
Sabato Marco Siniscalchi | 3 | 310 | 30.21 |
Kehuang Li | 4 | 57 | 7.61 |