Abstract | ||
---|---|---|
Multimodal Sentiment Analysis (MSA) is a challenging research area that studies sentiment expressed through multiple heterogeneous modalities. Given that pre-trained language models such as BERT have shown state-of-the-art (SOTA) performance across multiple NLP tasks, existing models tend to integrate these modalities into BERT and treat MSA as a single prediction task. However, we find that simply fusing the multimodal features into BERT does not fully exploit the power of a strong pre-trained model. Moreover, the classification ability of each modality is suppressed by single-task learning. In this paper, we propose a multimodal framework named Two-Phase Multi-task Sentiment Analysis (TPMSA). It applies a two-phase training strategy to make the most of the pre-trained model and a novel multi-task learning strategy to investigate the classification ability of each representation. We conducted experiments on two multimodal benchmark datasets, CMU-MOSI and CMU-MOSEI. The results show that our TPMSA model outperforms the current SOTA method on both datasets across most metrics, demonstrating the effectiveness of the proposed method. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/TASLP.2022.3178204 | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Keywords | DocType | Volume |
---|---|---|
Bit error rate, Task analysis, Multitasking, Visualization, Sentiment analysis, Training, Transformers, BERT, Multimodal sentiment analysis, multi-task | Journal | 30 |
ISSN | Citations | PageRank |
---|---|---|
2329-9290 | 0 | 0.34 |
References | Authors | |
---|---|---|
8 | 6 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Bo Yang | 1 | 903 | 100.69 |
Lijun Wu | 2 | 124 | 21.21 |
Jinhua Zhu | 3 | 0 | 5.07 |
Bo Shao | 4 | 2 | 4.13 |
Xiaola Lin | 5 | 1099 | 78.09 |
Tie-yan Liu | 6 | 4662 | 256.32 |