Title
VC-VQA: Visual Calibration Mechanism for Visual Question Answering
Abstract
Visual Question Answering (VQA) is a comprehensive task that requires answering questions about the visual content of an image. Recently, a number of studies have pointed out that VQA models tend to be misled by dataset biases and rely heavily on superficial correlations between question and answer rather than truly understanding the visual content. To address this issue, we propose a visual calibration mechanism for VQA (VC-VQA), which extends the conventional VQA model with an additional image feature reconstruction module. The proposed model reconstructs image features from the predicted answer and the question, and measures the similarity between the reconstructed and original image features; this similarity guides the VQA model in predicting the final answer. We evaluate our model on both the VQA v1 and VQA v2 datasets, showing that VC-VQA effectively reduces the impact of dataset bias and achieves competitive performance compared to other mainstream methods.
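The calibration idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `reconstruct` function is a hypothetical stand-in (an element-wise sum) for the learned reconstruction module, cosine similarity is one possible similarity measure, and the `weight` blending parameter and toy vectors are invented.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def reconstruct(answer_vec, question_vec):
    # Hypothetical stand-in for a learned reconstruction module:
    # combine answer and question embeddings by element-wise sum.
    return [a + q for a, q in zip(answer_vec, question_vec)]

def calibrate(scores, answer_vecs, question_vec, image_vec, weight=0.5):
    """Blend each answer's score with how well it reconstructs the image."""
    calibrated = {}
    for answer, score in scores.items():
        rebuilt = reconstruct(answer_vecs[answer], question_vec)
        similarity = cosine(rebuilt, image_vec)
        calibrated[answer] = (1 - weight) * score + weight * similarity
    return calibrated

# Toy example: a language prior favors "yellow", but the image feature
# lies closer to what "green" would reconstruct.
scores = {"yellow": 0.6, "green": 0.4}
answer_vecs = {"yellow": [1.0, 0.0], "green": [0.0, 1.0]}
question_vec = [0.1, 0.1]
image_vec = [0.0, 1.0]

calibrated = calibrate(scores, answer_vecs, question_vec, image_vec)
best_before = max(scores, key=scores.get)
best_after = max(calibrated, key=calibrated.get)
print(best_before, best_after)
```

In this toy setting the answer favored by the prior is overturned once the reconstruction similarity is taken into account. A real model would learn the reconstruction module and its interaction with the answer predictor jointly.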
Year
2020
DOI
10.1109/ICIP40778.2020.9190828
Venue
2020 IEEE International Conference on Image Processing (ICIP)
Keywords
Visual Question Answering, Dataset Bias, Visual Calibration, Feature Reconstruction
DocType
Conference
ISSN
1522-4880
Citations
0
PageRank
0.34
References
0
Authors
3
Name            Order   Citations   PageRank
Yanyuan Qiao    1       6           1.44
Zheng Yu        2       0           0.34
Jing Liu        3       1781        88.09