Title
Weighted Gradient Pretrain for Low-Resource Speech Emotion Recognition
Abstract
To alleviate the dependence of speech emotion recognition on large amounts of training data, a weighted gradient pre-training algorithm for low-resource speech emotion recognition is proposed. Multiple public emotion corpora are used for pre-training to generate shared hidden layer (SHL) parameters with generalization ability. These parameters are then used to initialize the downstream network for the recognition task on the low-resource dataset, thereby improving recognition performance on low-resource emotion corpora. However, the emotion categories differ among the public corpora and the number of samples varies greatly, which increases the difficulty of joint training on multiple emotion datasets. To this end, a weighted gradient (WG) algorithm is proposed to enable the shared layers to learn a generalized representation across datasets without compromising the priority of emotion recognition on each corpus. Experiments show that accuracy is improved by using CASIA, IEMOCAP, and eNTERFACE as the known datasets to pre-train the emotion models for GEMEP, and that performance can be further improved by combining WG with a gradient reversal layer.
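The abstract describes the pre-training scheme only at a high level. As a minimal sketch (not the authors' implementation), the following PyTorch code shows one way a weighted-gradient multi-corpus pre-training step could be organized: a shared encoder (SHL) with one classification head per corpus, where each corpus's loss is scaled by a per-corpus weight before its gradient reaches the shared parameters. All names (SharedEncoder, pretrain_step, weights), the label counts, the corpus sizes, and the inverse-size weighting rule are illustrative assumptions, since the abstract does not specify how the WG weights are computed.

import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    # Shared hidden layers (SHL) whose parameters later initialize the
    # downstream low-resource model.
    def __init__(self, feat_dim=40, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

# One classification head per corpus, since the emotion categories differ.
# Label counts and corpus sizes below are placeholders, not the paper's values.
corpus_classes = {"CASIA": 6, "IEMOCAP": 4, "eNTERFACE": 6}
corpus_sizes = {"CASIA": 1000, "IEMOCAP": 5000, "eNTERFACE": 1300}

encoder = SharedEncoder()
heads = nn.ModuleDict({c: nn.Linear(256, n) for c, n in corpus_classes.items()})

# Assumed weighting rule: scale each corpus's gradient inversely to its size
# so that large corpora do not dominate the shared-layer update.
total = sum(corpus_sizes.values())
weights = {c: total / (len(corpus_sizes) * n) for c, n in corpus_sizes.items()}

optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(heads.parameters()), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def pretrain_step(batches):
    # One joint update; `batches` maps corpus name -> (features, labels).
    optimizer.zero_grad()
    for corpus, (x, y) in batches.items():
        logits = heads[corpus](encoder(x))
        loss = weights[corpus] * criterion(logits, y)
        loss.backward()  # weighted gradients accumulate in the shared layers
    optimizer.step()

# Toy usage with random 40-dimensional features, 4 utterances per corpus.
batches = {c: (torch.randn(4, 40), torch.randint(0, n, (4,)))
           for c, n in corpus_classes.items()}
pretrain_step(batches)

After pre-training, the shared-layer parameters (here, encoder.state_dict()) would initialize the downstream model for the low-resource corpus (e.g., GEMEP), with a new head trained on the target labels; the gradient reversal layer mentioned in the abstract is not shown in this sketch.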
Year
2022
DOI
10.1587/transinf.2022EDL8014
Venue
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS
Keywords
speech emotion recognition, shared hidden layer, weighted gradient, low-resource
DocType
Journal
Volume
E105D
Issue
7
ISSN
1745-1361
Citations
0
PageRank
0.34
References
0
Authors
5
Name            Order   Citations   PageRank
Yue Xie         1       11          3.59
Ruiyu Liang     2       35          13.15
Xiaoyan Zhao    3       0           0.34
Zhenlin Liang   4       10          1.53
Jing Du         5       0           0.34