Speech emotion recognition using hidden Markov models - Citegraph

Paper Info

Title
Speech emotion recognition using hidden Markov models

Abstract
This paper introduces a first approach to emotion recognition using RAMSES, the UPC's speech recognition system. The approach is based on standard speech recognition technology using hidden semi-continuous Markov models. Both the selection of low level features and the design of the recognition system are addressed. Results are given on speaker dependent emotion recognition using the Spanish corpus of the INTERFACE Emotional Speech Synthesis Database. The accuracy in recognising seven different emotions (the six defined in MPEG-4 plus neutral style) exceeds 80% using the best combination of low level features and HMM structure. This result is very similar to that obtained with the same database in subjective evaluation by human judges.
Dealing with the speaker's emotion is one of the latest challenges in speech technologies. Three different aspects can be easily identified: speech recognition in the presence of emotional speech, synthesis of emotional speech, and emotion recognition. In this last case, the objective is to determine the emotional state of the speaker from the speech samples. Possible applications range from help with psychiatric diagnosis to intelligent toys, and the topic is a subject of recent but rapidly growing interest (1). This paper describes the TALP researchers' first approach to emotion recognition. The work falls within the scope of the INTERFACE project (2). The objective of this European Commission sponsored project is "to define new models and implement advanced tools for audio-video analysis, synthesis and representation in order to provide essential technologies for the implementation of large-scale virtual and augmented environments. The work is oriented to make man-machine interaction as natural as possible, based on everyday human communication by speech, facial expressions and body gestures."
In the field of emotion recognition from speech, the main goal of the INTERFACE project will be the construction of a real-time multi-lingual speaker independent emotion recogniser. For this purpose, large speech databases with recordings from many speakers and languages are needed. As these resources are not available yet, a reduced problem will be addressed first: emotion recognition in multi-speaker language dependent conditions. Namely, this paper deals with the recognition of emotion for two Spanish speakers using standard hidden Markov model technology.

Year	Venue	Keywords
2001	INTERSPEECH	real time,markov model,facial expression,speech synthesis,speech recognition,hidden markov model
Field	DocType	Citations
Speech synthesis,Recognition system,Markov model,Computer science,Emotion recognition,Speech recognition,Speaker recognition,Natural language processing,Artificial intelligence,Hidden Markov model	Conference	62
PageRank	References	Authors
5.56	5	4

Authors (4 rows)

Name	Order	Citations	PageRank
Albino Nogueiras	1	139	15.27
Asunción Moreno	2	399	44.97
Antonio Bonafonte	3	693	64.80
José B. Mariño	4	510	64.66
