Title
Empirical evaluation of emotion classification accuracy for non-acted speech
Abstract
Emotion recognition is important in the workplace because it affects a multitude of outcomes, such as performance, engagement, and well-being. Emotion recognition from audio is an attractive option due to its non-obtrusive nature and the availability of microphones in workplace devices. We describe building a classifier that analyzes the para-linguistic features of audio streams to classify them into positive, neutral, and negative affect. Since speech in the workplace differs from acted speech, and because it is important that the training data be situated in the right context, we designed and executed an emotion induction procedure to generate a corpus of non-acted speech from 33 speakers. The corpus was used to train a set of classification models, and a comparative analysis of these models was used to choose the feature parameters. Bootstrap aggregation (bagging) was then applied to the best combination of algorithm (Random Forest) and features (60-millisecond window size). The resulting classification accuracy of 73% is on par with, or exceeds, accuracies reported in the current literature for non-acted speech on a speaker-dependent test set. For reference, we also report the speaker-dependent recognition accuracy (95%) of the same classifier trained and tested on acted speech for three emotions from the Emo-DB database.
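The pipeline outlined in the abstract (frame-level para-linguistic features over 60 ms windows, fed to a bagged Random Forest) can be sketched roughly as below. This is a minimal sketch, not the authors' implementation: it assumes scikit-learn and librosa are available, uses MFCCs as a stand-in for the paper's unspecified para-linguistic feature set, and the frame_features helper and placeholder data are illustrative only.

```python
# Minimal sketch (not the authors' code): bagging of Random Forest classifiers
# over frame-level audio features, with ~60 ms analysis windows.
import numpy as np
import librosa
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier

def frame_features(path, sr=16000, win_ms=60):
    """MFCCs over ~60 ms windows (placeholder for para-linguistic features)."""
    y, _ = librosa.load(path, sr=sr)
    n_fft = int(sr * win_ms / 1000)              # 60 ms -> 960 samples at 16 kHz
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                                n_fft=n_fft, hop_length=n_fft // 2)
    return mfcc.T                                # one feature vector per frame

# X: stacked frame features for all utterances; y: affect labels
# (0 = negative, 1 = neutral, 2 = positive). Random stand-in data here.
X = np.random.rand(300, 13)
y = np.random.randint(0, 3, size=300)

# Bootstrap aggregation of Random Forests (scikit-learn >= 1.2 uses
# `estimator=`; older versions use `base_estimator=`).
clf = BaggingClassifier(
    estimator=RandomForestClassifier(n_estimators=100),
    n_estimators=10,
)
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
```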
Year
2017
DOI
10.1109/MMSP.2017.8122261
Venue
2017 IEEE 19th International Workshop on Multimedia Signal Processing (MMSP)
Keywords
classification models, feature parameters, speaker-dependent test set, speaker-dependent recognition accuracy, classifier, acted speech, empirical evaluation, emotion classification accuracy, non-acted speech, emotion recognition, attractive option, para-linguistic features, audio streams, training data, emotion induction procedure, non-acted speech data, Emo-DB database, bootstrap aggregation, classification accuracy, time 60.0 ms
Field
Pattern recognition, Emotion recognition, Computer science, Emotion classification, Feature extraction, Speech recognition, Bootstrap aggregating, Artificial intelligence, Affect (psychology), Classifier (linguistics), Random forest, Test set
DocType
Conference
ISSN
2163-3517
ISBN
978-1-5090-3650-9
Citations
1
PageRank
0.34
References
17
Authors
5