Title | ||
---|---|---|
Emotion recognition improvement using normalized formant supplementary features by hybrid of DTW-MLP-GMM model. |
Abstract | ||
---|---|---|
Over the past four decades, enormous effort has been devoted to developing automatic speech recognition systems that extract linguistic information, but much research is still needed to decode paralinguistic information such as speaking style and emotion. This paper investigates the effect of using the first three normalized formant frequencies and the pitch frequency as supplementary features on the performance of an emotion recognition system whose feature vector consists of Mel-frequency cepstral coefficients and energy-related features. The normalization is performed using a dynamic time warping (DTW)-multi-layer perceptron (MLP) hybrid model after determining the frequency range that is most affected by emotion. To reduce the number of features, the fast correlation-based filter and analysis of variance (ANOVA) methods are used. The emotional states are recognized using a Gaussian mixture model (GMM). Experimental results show that first formant (F1)-based warping combined with ANOVA-based feature selection gives the best performance among the systems simulated in this study, and that the average emotion recognition accuracy is acceptable compared with most recent research in this field. |
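
The abstract names ANOVA-based feature selection but does not spell out the procedure. As a rough illustration only, the sketch below ranks features by a one-way ANOVA F statistic computed across emotion classes; the function names `anova_f_score` and `rank_features`, the toy data, and the idea of ranking by raw F value are assumptions made here, not details taken from the paper.

```python
from typing import Dict, List, Sequence


def anova_f_score(groups: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA F statistic for one feature, given its values per class."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0  # variation of class means around the grand mean
    ss_within = 0.0   # variation of samples around their own class mean
    for g in groups:
        mean_g = sum(g) / len(g)
        ss_between += len(g) * (mean_g - grand_mean) ** 2
        ss_within += sum((x - mean_g) ** 2 for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    if ms_within == 0.0:
        # Degenerate case: no spread inside any class.
        return float("inf") if ms_between > 0.0 else 0.0
    return ms_between / ms_within


def rank_features(samples: List[List[float]], labels: List[str]) -> List[int]:
    """Return feature indices sorted from highest to lowest F statistic."""
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        by_class: Dict[str, List[float]] = {}
        for row, label in zip(samples, labels):
            by_class.setdefault(label, []).append(row[j])
        scores.append(anova_f_score(list(by_class.values())))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Hypothetical toy data: feature 0 separates the two classes, feature 1 does not.
toy_samples = [[1.0, 5.0], [2.0, 4.0], [3.0, 6.0],
               [4.0, 5.0], [5.0, 6.0], [6.0, 4.0]]
toy_labels = ["neutral", "neutral", "neutral", "anger", "anger", "anger"]
toy_ranking = rank_features(toy_samples, toy_labels)
```

In a pipeline like the one described, only the top-ranked features would be passed on to the classifier; how many to keep is a tuning choice that the abstract does not state.
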
Year | DOI | Venue |
---|---|---|
2013 | 10.1007/s00521-012-0884-7 | Neural Computing and Applications |
Keywords | DocType | Volume |
---|---|---|
Emotion recognition, Formant normalization, DTW, MLP, GMM | Journal | 22 |
Issue | ISSN | Citations |
---|---|---|
6 | 1433-3058 | 9 |
PageRank | References | Authors |
---|---|---|
0.41 | 35 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Davood Gharavian | 1 | 117 | 10.06 |
Mansour Sheikhan | 2 | 297 | 20.38 |
Farhad Ashoftedel | 3 | 20 | 1.23 |