Title | ||
---|---|---|
An Unsupervised Frame Selection Technique for Robust Emotion Recognition in Noisy Speech |
Abstract | ||
---|---|---|
Automatic emotion recognition with good accuracy has been demonstrated for clean speech, but performance deteriorates quickly when the speech is contaminated with noise. In this paper, we propose an unsupervised method, based on a front-end voice activity detector (VAD), that selects the frames with a relatively higher signal-to-noise ratio (SNR) in the spoken utterances. We then extract a large number of statistical features from low-level audio descriptors and use them for emotion recognition with state-of-the-art classifiers. We carried out extensive experiments on two standard databases contaminated with five types of noise (Babble, F-16, Factory, Volvo, and HF-channel) from the Noisex-92 noise database at five SNR levels (0, 5, 10, 15, and 20 dB). In all experiments, which classify emotions in both the categorical and the dimensional spaces, the proposed technique outperforms a Recurrent Neural Network (RNN)-based VAD across all five noise types and SNR levels, and on both databases. |
Year | DOI | Venue |
---|---|---|
2018 | 10.23919/EUSIPCO.2018.8553202 | European Signal Processing Conference |
Keywords | Field | DocType |
Speech emotion,Noisy speech,Voice activity detector,Emotion recognition | Noise measurement,Computer science,Categorical variable,Emotion recognition,Signal-to-noise ratio,Recurrent neural network,Feature extraction,Speech recognition,Detector | Conference |
ISSN | Citations | PageRank |
2076-1465 | 0 | 0.34 |
References | Authors | |
0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Meghna Pandharipande | 1 | 7 | 3.61 |
Rupayan Chakraborty | 2 | 15 | 8.21 |
Ashish Panda | 3 | 4 | 5.31 |
Sunil Kumar Kopparapu | 4 | 42 | 25.18 |