Abstract |
---|
Answering complex questions about images is an ambitious goal for machine intelligence, requiring a joint understanding of images, text, and commonsense knowledge, as well as strong reasoning ability. Recently, multimodal Transformers have made great progress on the task of Visual Commonsense Reasoning (VCR) by jointly understanding visual objects and text tokens through layers of cross-modality attention. However, these approaches do not utilize the rich structure of the scene and the interactions between objects, which are essential for answering complex commonsense questions. We propose a Scene Graph Enhanced Image-Text Learning (SGEITL) framework to incorporate visual scene graphs into commonsense reasoning. To exploit the scene-graph structure at the model level, we propose a multi-hop graph transformer that regularizes attention interactions among hops. For pre-training, we propose a scene-graph-aware method that leverages structural knowledge extracted from the visual scene graph. Moreover, we introduce a method to train and generate domain-relevant visual scene graphs from textual annotations in a weakly supervised manner. Extensive experiments on VCR and other tasks show a significant performance boost over state-of-the-art methods and demonstrate the efficacy of each proposed component. |
Year | Venue | Keywords
---|---|---
2022 | AAAI Conference on Artificial Intelligence | Knowledge Representation And Reasoning (KRR), Computer Vision (CV), Machine Learning (ML), Cognitive Modeling & Cognitive Systems (CMS)

DocType | ISSN | Citations
---|---|---
Conference | AAAI 2022 | 0

PageRank | References | Authors
---|---|---
0.34 | 0 | 8
Name | Order | Citations | PageRank
---|---|---|---
Zhecan Wang | 1 | 20 | 2.74 |
Haoxuan You | 2 | 37 | 4.87 |
Liunian Harold Li | 3 | 0 | 2.03 |
Alireza Zareian | 4 | 7 | 4.20
Suji Park | 5 | 0 | 0.34 |
Yiqing Liang | 6 | 0 | 0.68 |
Kai-Wei Chang | 7 | 0 | 0.68 |
Shih-Fu Chang | 8 | 13015 | 1101.53 |