Title
An Unsupervised Frame Selection Technique for Robust Emotion Recognition in Noisy Speech
Abstract
Automatic emotion recognition achieves good accuracy on clean speech, but performance deteriorates quickly when the speech is contaminated with noise. In this paper, we propose an unsupervised method, based on a front-end voice activity detector (VAD), that selects the frames with a relatively better signal-to-noise ratio (SNR) in the spoken utterances. We then extract a large number of statistical features from low-level audio descriptors and use them for emotion recognition with state-of-the-art classifiers. Extensive experimentation has been carried out on two standard databases contaminated with 5 types of noise (Babble, F-16, Factory, Volvo, and HF-channel) from the Noisex-92 noise database at 5 different SNR levels (0, 5, 10, 15, and 20 dB). In all experiments, which classify emotions in both the categorical and the dimensional spaces, the proposed technique outperforms a Recurrent Neural Network (RNN)-based VAD across all 5 noise types and SNR levels, and for both databases.
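The abstract does not give the selection rule or the noise-mixing procedure, so the following is only a minimal sketch under stated assumptions. `mix_at_snr` shows the standard way to contaminate a clean signal at a target SNR (the noise is scaled so that the speech-to-noise power ratio equals the requested value in dB). `select_frames` is a hypothetical stand-in for the paper's VAD-based rule: it uses short-time energy as a crude proxy for per-frame SNR and keeps the highest-energy frames. All function names, the frame length, and the `keep_ratio` parameter are illustrative and do not come from the paper.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals `snr_db` (in dB), then add it.

    Assumes `speech` and `noise` have the same length.
    """
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def select_frames(signal, frame_len, keep_ratio):
    """Return sorted indices of the highest-energy frames.

    Hypothetical proxy: frame energy stands in for per-frame SNR.
    This is NOT the selection rule of the paper, which the abstract does not specify.
    """
    n_frames = len(signal) // frame_len
    energies = [power(signal[i * frame_len:(i + 1) * frame_len]) for i in range(n_frames)]
    n_keep = max(1, int(round(keep_ratio * n_frames)))
    ranked = sorted(range(n_frames), key=lambda i: energies[i], reverse=True)
    return sorted(ranked[:n_keep])


if __name__ == "__main__":
    random.seed(0)
    # Synthetic "speech": alternating silent and voiced segments of 160 samples each.
    speech = []
    for seg in range(8):
        amp = 1.0 if seg % 2 else 0.0
        speech += [amp * math.sin(0.2 * i) for i in range(160)]
    noise = [random.gauss(0.0, 1.0) for _ in range(len(speech))]
    noisy = mix_at_snr(speech, noise, snr_db=5)
    print(select_frames(noisy, frame_len=160, keep_ratio=0.5))
```

In a pipeline of the kind the abstract describes, the retained frame indices would determine which frames feed the low-level descriptor and statistical feature extraction; that step is omitted here.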
Year
2018
DOI
10.23919/EUSIPCO.2018.8553202
Venue
European Signal Processing Conference
Keywords
Speech emotion, Noisy speech, Voice activity detector, Emotion recognition
Field
Noise measurement, Computer science, Categorical variable, Emotion recognition, Signal-to-noise ratio, Recurrent neural network, Feature extraction, Speech recognition, Detector
DocType
Conference
ISSN
2076-1465
Citations
0
PageRank
0.34
References
0
Authors
4
Name                     Order  Citations  PageRank
Meghna Pandharipande     1      7          3.61
Rupayan Chakraborty      2      15         8.21
Ashish Panda             3      4          5.31
Sunil Kumar Kopparapu    4      42         25.18