Title
Bi-modality Fusion for Emotion Recognition in the Wild.
Abstract
Emotion recognition in the wild has been a hot research topic in affective computing. Although some progress has been made, it remains an unsolved problem due to challenges such as head movement, face deformation, and illumination variation. To deal with these unconstrained conditions, we propose a bi-modality fusion method for video-based emotion recognition in the wild. The proposed framework takes advantage of both the visual information from facial expression sequences and the speech information from audio. State-of-the-art CNN-based object recognition models are employed to improve facial expression recognition performance, and a bidirectional long short-term memory (Bi-LSTM) network captures the dynamics of the learned features. Additionally, to take full advantage of the facial expression information, a VGG16 network is trained on the AffectNet dataset to obtain a specialized facial expression recognition model. On the audio side, features such as low-level descriptors (LLDs) and deep features extracted from spectrogram images are also developed to improve recognition performance. The best experimental result shows that the overall accuracy of our algorithm on the Test set of the EmotiW challenge is 62.78%, which outperforms the best result of EmotiW 2018 and ranks 2nd in the EmotiW 2019 challenge.
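The abstract outlines two streams and their fusion: per-frame CNN features passed through a Bi-LSTM on the visual side, and utterance-level audio features on the other, combined at the score level. Below is a minimal PyTorch sketch of such a pipeline, not the authors' released code: the layer sizes, the seven-class output (the AFEW/EmotiW emotion set), the 1582-dimensional audio input (the openSMILE IS10 LLD set is one plausible choice), and the fusion weight alpha are all illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

class VisualBranch(nn.Module):
    """VGG16 frame features -> Bi-LSTM -> emotion logits (7 EmotiW classes)."""
    def __init__(self, num_classes: int = 7, hidden: int = 256):
        super().__init__()
        vgg = models.vgg16(weights=None)          # would be AffectNet-pretrained in the paper
        self.backbone = vgg.features              # convolutional feature extractor
        self.pool = nn.AdaptiveAvgPool2d(1)       # -> (B*T, 512, 1, 1)
        self.lstm = nn.LSTM(512, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (B, T, 3, H, W), a sequence of aligned face crops
        b, t = frames.shape[:2]
        x = self.backbone(frames.flatten(0, 1))   # per-frame conv features
        x = self.pool(x).flatten(1).view(b, t, -1)
        x, _ = self.lstm(x)                       # temporal modelling over the clip
        return self.head(x.mean(dim=1))           # average over time steps

class AudioBranch(nn.Module):
    """Utterance-level audio features (e.g. LLD statistics) -> emotion logits."""
    def __init__(self, in_dim: int = 1582, num_classes: int = 7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, num_classes))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.mlp(feats)

def fuse(visual_logits, audio_logits, alpha: float = 0.7):
    # Weighted score-level fusion; alpha is a hypothetical weight that would
    # be tuned (or learned) on the validation set.
    return alpha * visual_logits.softmax(-1) + (1 - alpha) * audio_logits.softmax(-1)

if __name__ == "__main__":
    v = VisualBranch()(torch.randn(2, 8, 3, 224, 224))  # 2 clips, 8 frames each
    a = AudioBranch()(torch.randn(2, 1582))
    print(fuse(v, a).shape)  # torch.Size([2, 7])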
Year
2019
DOI
10.1145/3340555.3355719
Venue
ICMI
Keywords
Emotion Recognition, Deep Learning, Convolutional Neural Networks
Field
Computer science, Emotion recognition, Fusion, Human–computer interaction
DocType
Conference
ISBN
978-1-4503-6860-5
Citations
2
PageRank
0.35
References
0
Authors
8
Name            Order  Citations  PageRank
Sunan Li        1      2          0.35
Wenming Zheng   2      1240       80.70
Yuan Zong       3      162        17.39
Cheng Lu        4      52         6.33
Chuangao Tang   5      28         4.25
Xingxun Jiang   6      3          1.09
Jiateng Liu     7      3          1.76
Wanchuang Xia   8      2          0.35