Title
Transformer-based Non-Verbal Emotion Recognition: Exploring Model Portability across Speakers' Genders
Abstract
Recognizing emotions in non-verbal audio tracks requires a deep understanding of their underlying features. Traditional classifiers relying on excitation, prosodic, and vocal tract features are not always capable of generalizing effectively across speakers' genders. In the ComParE 2022 vocalisation sub-challenge, we explore the use of a Transformer architecture trained on contrastive audio examples. We leverage augmented data to learn robust non-verbal emotion classifiers. We also investigate the impact of different audio transformations, including neural voice conversion, on the classifier's ability to generalize across speakers' genders. The empirical findings indicate that neural voice conversion is beneficial in the pretraining phase, where it improves model generality, whereas it is harmful at the fine-tuning stage, as it hinders model specialization for the task of non-verbal emotion recognition.
Year: 2022
DOI: 10.1145/3551876.3554801
Venue: International Multimedia Conference
DocType: Conference
Citations: 0
PageRank: 0.34
References: 0
Authors: 6
Name               Order  Citations  PageRank
Lorenzo Vaiani     1      0          1.01
Alkis Koudounas    2      0          0.68
Moreno La Quatra   3      0          1.01
Luca Cagliero      4      0          0.68
Paolo Garza        5      0          0.68
Elena Baralis      6      0          0.34