Title
Orthogonalization-Guided Feature Fusion Network for Multimodal 2D+3D Facial Expression Recognition
Abstract
As 2D and 3D data present different views of the same face, the features extracted from them can be both complementary and redundant. In this paper, we present a novel and efficient orthogonalization-guided feature fusion network, namely OGF2Net, to fuse the features extracted from 2D and 3D faces for facial expression recognition. While 2D texture maps are fed into a 2D feature extraction pipeline (FE2DNet), the attribute maps generated from 3D data are concatenated as the input of the 3D feature extraction pipeline (FE3DNet). The two networks are separately trained in the first stage and frozen in the second stage for late feature fusion, which addresses the unavailability of large numbers of 2D+3D face pairs. To reduce the redundancy between features extracted from the 2D and 3D streams, we design an orthogonal loss-guided feature fusion network that orthogonalizes the features before fusing them. Experimental results show that the proposed method significantly outperforms state-of-the-art algorithms on both the BU-3DFE and Bosphorus databases. Accuracies as high as 89.05% (P1 protocol) and 89.07% (P2 protocol) are achieved on the BU-3DFE database, and an accuracy of 89.28% is achieved on the Bosphorus database. The complexity analysis also suggests that our approach achieves a higher processing speed while requiring less memory.
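The abstract's core idea is an orthogonal loss that penalizes overlap between the 2D and 3D feature vectors before fusion. The record gives no formula, so the snippet below is only an illustrative sketch under a common assumption: the loss is the mean squared cosine similarity between paired, L2-normalized 2D and 3D features (zero when the two streams are exactly orthogonal). The function name and the NumPy formulation are hypothetical, not taken from the paper.

```python
import numpy as np

def orthogonality_loss(f2d: np.ndarray, f3d: np.ndarray) -> float:
    """Mean squared cosine similarity between paired feature vectors.

    f2d, f3d: arrays of shape (batch, dim) holding features from the
    2D and 3D streams. Returns 0.0 when every pair is orthogonal and
    1.0 when every pair is parallel (fully redundant).
    """
    # L2-normalize each feature vector so only direction matters.
    f2d = f2d / np.linalg.norm(f2d, axis=1, keepdims=True)
    f3d = f3d / np.linalg.norm(f3d, axis=1, keepdims=True)
    # Per-pair cosine similarity, squared, averaged over the batch.
    cos = np.sum(f2d * f3d, axis=1)
    return float(np.mean(cos ** 2))
```

Minimizing such a term alongside the classification loss would push the two streams toward complementary (orthogonal) representations, matching the redundancy-reduction goal described above.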
Year: 2021
DOI: 10.1109/TMM.2020.3001497
Venue: IEEE TRANSACTIONS ON MULTIMEDIA
Keywords: Multimodal facial expression recognition, feature fusion
DocType: Journal
Volume: 23
ISSN: 1520-9210
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name          Order  Citations  PageRank
Shisong Lin   1      0          1.69
Mengchao Bai  2      0          0.34
Feng Liu      3      105        9.27
Linlin Shen   4      1351       90.25
Yicong Zhou   5      1822       108.83