Abstract |
---|
As speech-based emotion recognition has matured to the point where it is applicable in real-life conditions, it is time for a realistic view of obtainable performance. Most state-of-the-art emotion recognition methods are based on turn- and frame-level analysis independent of phonetic transcription. True speaker-disjoint partitioning of training and test sets is still less common than simple cross-validation. Even speaker-disjoint experiments give little insight into the generalization ability of modern emotion recognition engines, since the training and test sets used for system development tend to be similar in acoustic channel, noise overlay, and language. A considerably more realistic impression can be gained from cross-corpora evaluation. Tuning of the emotion classification engine (feature set optimization and normalization, selection of a classification technique and the corresponding parameter configuration) is an important issue in realistic evaluations. In the ideal case, an optimal classifier configuration estimated on training data should provide outstanding recognition performance on unseen data. We therefore compare the cross-corpora classification performance of optimized and non-optimized general and phonetic-pattern-dependent classifiers. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/ACII.2013.81 | ACII |
Keywords | Field | DocType |
emotion classification engine,emotion recognition,parameter optimization issues,cross-corpora emotion classification,outstanding recognition performance,cross-corpora classification performance,realistic view,classification technique,state-of-the-art emotion recognition method,realistic evaluation,realistic impression,modern emotion recognition engine,speaker recognition | Disjoint sets,Normalization (statistics),Phonetic transcription,Communication,Computer science,Emotion classification,Speaker recognition,Artificial intelligence,Classifier (linguistics),Communication channel,Emotion perception,Speech recognition,Machine learning | Conference |
ISSN | Citations | PageRank |
2156-8103 | 2 | 0.37 |
References | Authors | |
18 | 3 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Bogdan Vlasenko | 1 | 235 | 12.72 |
David Philippou-Hübner | 2 | 35 | 2.48 |
Andreas Wendemuth | 3 | 451 | 41.74 |