Paper Info

Title
An Unsupervised Frame Selection Technique for Robust Emotion Recognition in Noisy Speech

Abstract
Automatic emotion recognition achieves good accuracy on clean speech, but performance deteriorates quickly when the speech is contaminated with noise. In this paper, we propose an unsupervised method, based on a front-end voice activity detector (VAD), that selects the frames with a relatively better signal-to-noise ratio (SNR) in spoken utterances. We then extract a large number of statistical features from low-level audio descriptors and perform emotion recognition with state-of-the-art classifiers. Extensive experiments have been carried out on two standard databases contaminated with 5 types of noise (Babble, F-16, Factory, Volvo, and HF-channel) from the Noisex-92 noise database at 5 different SNR levels (0, 5, 10, 15, and 20 dB). In all experiments, which classify emotions in both the categorical and the dimensional spaces, the proposed technique outperforms a Recurrent Neural Network (RNN)-based VAD across all 5 noise types and SNR levels, and on both databases.
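
The abstract describes contaminating speech with noise at fixed SNR levels. As a minimal sketch of that general step only (not the authors' code), the snippet below scales a noise signal so that the speech-to-noise power ratio matches a target SNR in dB. The function names `power`, `mix_at_snr`, and `measured_snr_db` are hypothetical, and the sine tone and Gaussian noise are stand-ins for real speech and Noisex-92 recordings.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # From SNR_dB = 10 * log10(P_speech / P_noise), solve for the noise power.
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR in dB of `noisy`, given the clean `speech` it was built from."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(power(speech) / power(residual))


# Stand-in signals (assumptions): a 220 Hz tone as "speech", white noise as "noise".
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# The five SNR levels listed in the abstract.
for snr_db in (0, 5, 10, 15, 20):
    noisy = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, noisy), 2))
```

Computing the gain from average power over the whole utterance is one common convention; a pipeline that measures speech power only over active frames would yield a different scaling for the same nominal SNR.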

Year	2018
DOI	10.23919/EUSIPCO.2018.8553202
Venue	European Signal Processing Conference
Keywords	Speech emotion, Noisy speech, Voice activity detector, Emotion recognition
Field	Noise measurement, Computer science, Categorical variable, Emotion recognition, Signal-to-noise ratio, Recurrent neural network, Feature extraction, Speech recognition, Detector
DocType	Conference
ISSN	2076-1465
Citations	0
PageRank	0.34
References	0
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Meghna Pandharipande	1	7	3.61
Rupayan Chakraborty	2	15	8.21
Ashish Panda	3	4	5.31
Sunil Kumar Kopparapu	4	42	25.18
