Title
Speech Emotion Recognition method using time-stretching in the Preprocessing Phase and Artificial Neural Network Classifiers
Abstract
Human emotions play a significant role in understanding human behaviour. Emotions can be recognized in multiple ways, one of which is through speech. This paper presents an approach for designing a Speech Emotion Recognition (SER) system for an industrial training station. While the end user assembles a product, their emotions can be monitored and used as a parameter for adapting the training station. The proposed method uses a phase vocoder for time-stretching and an Artificial Neural Network (ANN) to classify five typical emotions. The features extracted as input for the ANN classifier include Mel Frequency Cepstral Coefficients (MFCCs), short-term energy, zero-crossing rate, pitch, and speech rate. The proposed method was evaluated on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and shows promising results compared to other methods such as zero-padding.
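The contrast the abstract draws between time-stretching and zero-padding can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame size, hop size, target length, and all function names below are assumptions chosen for illustration, and only two of the listed features (short-term energy and zero-crossing rate) are computed. MFCCs, pitch, speech rate, and the ANN classifier are left out.

```python
import numpy as np

N_FFT, HOP = 512, 128  # illustrative frame and hop sizes, not taken from the paper


def frame_signal(x):
    """Slice x into overlapping frames of N_FFT samples, HOP samples apart."""
    n_frames = 1 + (len(x) - N_FFT) // HOP
    idx = np.arange(N_FFT)[None, :] + HOP * np.arange(n_frames)[:, None]
    return x[idx]


def stft(x):
    """Hann-windowed short-time Fourier transform, shape (frames, bins)."""
    return np.fft.rfft(frame_signal(x) * np.hanning(N_FFT), axis=1)


def istft(spec):
    """Overlap-add inverse of stft, normalised by the summed squared window."""
    window = np.hanning(N_FFT)
    frames = np.fft.irfft(spec, n=N_FFT, axis=1)
    out = np.zeros(N_FFT + HOP * (len(spec) - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + N_FFT] += frame * window
        norm[i * HOP:i * HOP + N_FFT] += window ** 2
    return out / np.maximum(norm, 0.1)  # floor limits amplification at the edges


def phase_vocoder(spec, rate):
    """Resample the STFT in time by `rate` (<1 slows down, >1 speeds up)."""
    n_bins = spec.shape[1]
    omega = 2 * np.pi * HOP * np.arange(n_bins) / N_FFT  # expected phase advance per hop
    steps = np.arange(0, len(spec) - 1, rate)
    phase = np.angle(spec[0])
    out = np.empty((len(steps), n_bins), dtype=complex)
    for k, t in enumerate(steps):
        i, frac = int(t), t - int(t)
        # Interpolate magnitudes between the two neighbouring analysis frames.
        mag = (1 - frac) * np.abs(spec[i]) + frac * np.abs(spec[i + 1])
        out[k] = mag * np.exp(1j * phase)
        # Accumulate the measured phase advance, wrapped around the expected one.
        dphi = np.angle(spec[i + 1]) - np.angle(spec[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase += omega + dphi
    return out


def fix_length(x, target_len):
    """Zero-padding baseline: trim or pad x to exactly target_len samples."""
    if len(x) >= target_len:
        return x[:target_len]
    return np.pad(x, (0, target_len - len(x)))


def stretch_to_length(x, target_len):
    """Time-stretch x to roughly target_len samples, then fix the remainder."""
    stretched = istft(phase_vocoder(stft(x), len(x) / target_len))
    return fix_length(stretched, target_len)


def short_term_energy(x):
    """Sum of squared samples per frame."""
    return np.sum(frame_signal(x) ** 2, axis=1)


def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs per frame whose sign differs."""
    signs = np.sign(frame_signal(x))
    return np.mean(np.abs(np.diff(signs, axis=1)) > 0, axis=1)


# Hypothetical input: a 0.5 s, 440 Hz tone standing in for a short utterance.
sr = 16000
clip = np.sin(2 * np.pi * 440 * np.arange(sr // 2) / sr + 0.3)
padded = fix_length(clip, sr)            # second half is silence
stretched = stretch_to_length(clip, sr)  # content spread over most of the second
```

Both outputs have the same fixed length, which is what a fixed-size ANN input requires. The padded clip ends in silence, while the stretched clip keeps signal content across nearly its whole duration at the original pitch.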
Year
2020
DOI
10.1109/ICCP51029.2020.9266265
Venue
2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP)
Keywords
Speech Emotion Recognition, Mel Frequency Cepstral Coefficients, Audio Time-Stretching, ANN
DocType
Conference
ISSN
2065-9946
ISBN
978-1-7281-9081-5
Citations
0
PageRank
0.34
References
0
Authors
2
Name | Order | Citations | PageRank
Valentin-Catalin Govoreanu | 1 | 0 | 0.34
Mihai Neghina | 2 | 0 | 0.34