Title | ||
---|---|---|
The Weighted Cross-Modal Attention Mechanism With Sentiment Prediction Auxiliary Task for Multimodal Sentiment Analysis |
Abstract | ||
---|---|---|
The human brain extracts spatial and temporal semantic information by processing multiple modalities, which is contextually meaningful for perceiving and understanding an individual's emotional state. However, modeling multimodal sequences poses two main challenges: 1) the different sampling rates of the modalities make cross-modal interaction difficult; 2) unimodal representations must be fused efficiently while the relationships among the modalities are captured effectively. In this paper, we design a weighted cross-modal attention mechanism that not only captures the temporal correlation and spatial dependence information of each modality, but also dynamically adjusts the weight of each modality across time steps. In addition, unimodal subtasks are introduced to assist modality-specific representation learning; the multimodal task and the unimodal subtasks are trained jointly to explore the complementary relationships among the modalities. Our model sets a new state of the art on the CMU-MOSI dataset, with noticeable improvements on all metrics. On the CMU-MOSEI dataset, our model achieves the best results among all compared models on the binary-classification F1 score, the 7-class task, and the regression task; on binary-classification accuracy it is second only to the multimodal split attention fusion (MSAF) model with aligned data, demonstrating the effectiveness of the proposed method. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/TASLP.2022.3192728 | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Keywords | DocType | Volume |
Task analysis, Data models, Acoustics, Logic gates, Visualization, Sentiment analysis, Representation learning, Cross-modal attention, joint optimization, multimodal sentiment analysis, multi-task learning, multimodal fusion | Journal | 30 |
Issue | ISSN | Citations |
1 | 2329-9290 | 0 |
PageRank | References | Authors |
0.34 | 0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Qiupu Chen | 1 | 0 | 0.34 |
Guimin Huang | 2 | 6 | 9.26 |
Yabing Wang | 3 | 1 | 1.37 |