Abstract |
---|
Recognizing emotional traits in speech is a challenging task that has become very popular in recent years, especially due to advances in deep neural networks. Although very successful, these models inherit a common problem from strongly supervised deep neural networks: a large number of strongly labeled samples is necessary for the model to learn a general emotion representation. This paper proposes a solution to this problem: a semi-supervised neural network that can learn speech representations from unlabeled samples and use them in different speech emotion recognition scenarios. We provide experiments with different datasets, representing natural and controlled scenarios. Our results show that our model is competitive with state-of-the-art solutions in all these scenarios while sharing the same learned representations, which were learned without the need for strongly labeled data. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1007/978-3-030-01418-6_77 | Artificial Neural Networks and Machine Learning - ICANN 2018, Part I |
Keywords | Field | DocType |
Emotion recognition, Semi-supervised learning, GAN, Speech representation, Deep learning | Semi-supervised learning, Computer science, Emotion recognition, Artificial intelligence, Deep learning, Labeled data, Artificial neural network, Deep neural networks, Machine learning | Conference |
Volume | ISSN | Citations |
11139 | 0302-9743 | 0 |
PageRank | References | Authors |
0.34 | 6 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ingryd Pereira | 1 | 0 | 0.34 |
Diego Santos | 2 | 0 | 0.34 |
Alexandre M. A. Maciel | 3 | 4 | 5.43 |
Pablo V. A. Barros | 4 | 119 | 22.02 |