Title
Building Naturalistic Emotionally Balanced Speech Corpus by Retrieving Emotional Speech From Existing Podcast Recordings
Abstract
The lack of a large, natural emotional database is one of the key barriers to translating results on speech emotion recognition in controlled conditions into real-life applications. Collecting emotional databases is expensive and time-consuming, which limits the size of existing corpora. Current approaches used to collect spontaneous databases tend to provide unbalanced emotional content, which is dictated by the given recording protocol (e.g., positive for colloquial conversations, negative for discussions or debates). The size and speaker diversity are also limited. This paper proposes a novel approach to effectively build a large, naturalistic emotional database with balanced emotional content, reduced cost and reduced manual labor. It relies on existing spontaneous recordings obtained from audio-sharing websites. The proposed approach combines machine learning algorithms to retrieve recordings conveying balanced emotional content with a cost-effective annotation process using crowdsourcing, which makes it possible to build a large-scale emotional speech database. This approach provides natural emotional renditions from multiple speakers, with different channel conditions and balanced emotional content, which are difficult to obtain with alternative data collection protocols.
Year
2019
DOI
10.1109/TAFFC.2017.2736999
Venue
IEEE Transactions on Affective Computing
Keywords
Databases, Speech, Speech recognition, Digital audio broadcasting, Speech processing, Emotion recognition, Machine learning algorithms
DocType
Journal
Volume
10
Issue
4
ISSN
1949-3045
Citations
13
PageRank
0.57
References
23
Authors
2
Name            Order   Citations   PageRank
Reza Lotfian    1       30          2.65
Carlos Busso    2       1616        93.04