A Two-Stage Spatiotemporal Attention Convolution Network For Continuous Dimensional Emotion Recognition From Facial Video - Citegraph

Paper Info

Title
A Two-Stage Spatiotemporal Attention Convolution Network For Continuous Dimensional Emotion Recognition From Facial Video

Abstract
Continuous dimensional emotion recognition from facial video sequences is a crucial and challenging task in Affective Computing and Human-Computer Intelligent Interaction. The key to this task is to extract and discriminate spatial-temporal features effectively and at a fine granularity. In this paper, a Two-Stage Spatiotemporal Attention Temporal Convolution Network (TS-SATCN) is designed for continuous dimensional emotion recognition from facial videos. The first stage generates an initial recognition result, which is then fed into the second stage for correction. In each stage, a spatiotemporal attention branch helps the network learn different attention levels and focus adaptively on the informative spatial-temporal features. The network is trained with a proposed smooth loss function that further improves the quality of the predictions. Extensive experiments on two datasets, RECOLA and AFEW-VA, show that the proposed method achieves significant improvement over state-of-the-art methods.
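The abstract describes two ideas that a short sketch may make more concrete: a second stage that corrects the first stage's per-frame output, and a loss that rewards temporally smooth predictions. The abstract does not define either component, so everything below is an assumption made for illustration: the names `two_stage_predict`, `moving_average`, and `smooth_loss` are hypothetical, the moving average merely stands in for a learned second stage, and the loss (mean squared error plus a penalty on frame-to-frame jumps) is one plausible form, not the paper's formulation.

```python
from typing import Callable, List, Sequence

# A "stage" maps a per-frame sequence to a per-frame sequence (hypothetical type).
Stage = Callable[[Sequence[float]], List[float]]


def two_stage_predict(frames: Sequence[float], stage_one: Stage, stage_two: Stage) -> List[float]:
    """Run stage one, then pass its output to stage two for correction."""
    initial = stage_one(frames)
    return stage_two(initial)


def moving_average(values: Sequence[float], window: int = 3) -> List[float]:
    """Stand-in for a learned refinement stage: average over a centered window."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def smooth_loss(pred: Sequence[float], target: Sequence[float], weight: float = 0.1) -> float:
    """Assumed form: MSE plus `weight` times the mean squared frame-to-frame jump."""
    n = len(pred)
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    if n < 2:
        return mse  # a single frame has no neighbouring frame to compare against
    jumps = sum((pred[i] - pred[i - 1]) ** 2 for i in range(1, n)) / (n - 1)
    return mse + weight * jumps


if __name__ == "__main__":
    noisy = [0.0, 3.0, 0.0]  # toy per-frame predictions, not real data
    refined = two_stage_predict(noisy, list, moving_average)
    print(refined, smooth_loss(refined, [1.0, 1.0, 1.0]))
```

In this toy setup the second stage only smooths; in the paper both stages are attention-augmented temporal convolution networks, which this sketch does not attempt to reproduce.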

Year	2021
DOI	10.1109/LSP.2021.3063609
Venue	IEEE Signal Processing Letters
Keywords	Feature extraction, Convolution, Emotion recognition, Spatiotemporal phenomena, Faces, Task analysis, Stacking, Continuous emotion recognition, spatiotemporal attention, TCN
DocType	Journal
Volume	28
ISSN	1070-9908
Citations	0
PageRank	0.34
References	0
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Min Hu	1	31	12.64
Qian Chu	2	0	0.34
Xiaohua Wang	3	1	2.12
Lei He	4	21	4.75
Fuji Ren	5	803	135.33
