Title
Misperceptions Of The Emotional Content Of Natural And Vocoded Speech In A Car
Abstract
This paper analyzes a) how often listeners interpret the emotional content of an utterance incorrectly when listening to vocoded or natural speech in adverse conditions; b) which noise conditions cause the most misperceptions; and c) which group of listeners misinterprets emotions the most. The long-term goal is to construct new emotional speech synthesizers that adapt to the environment and to the listener. We performed a large-scale listening test in which over 400 listeners between the ages of 21 and 72 assessed natural and vocoded acted emotional speech stimuli. The stimuli had been artificially degraded using a room impulse response recorded in a car and various in-car noise types recorded in a real car. Experimental results show that the recognition rates for emotions and perceived emotional strength degrade as signal-to-noise ratio decreases. Interestingly, misperceptions seem to be more pronounced for negative and low-arousal emotions such as calmness or anger, while positive emotions such as happiness appear to be more robust to noise. An ANOVA of listener metadata further revealed that gender and age also influenced results, with elderly male listeners most likely to incorrectly identify emotions.
Year
2017
DOI
10.21437/Interspeech.2017-532
Venue
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION
Keywords
emotional perception, speech in noise, emotion recognition, car noise
Field
Computer science, Speech recognition
DocType
Conference
ISSN
2308-457X
Citations
1
PageRank
0.37
References
3
Authors
4
Name                       Order  Citations  PageRank
Jaime Lorenzo-Trueba       1      46         9.26
Cassia Valentini-Botinhao  2      208        18.41
Gustav Eje Henter          3      37         11.40
Junichi Yamagishi          4      1906       145.51