Title
VSR++: Improving Visual Semantic Reasoning for Fine-Grained Image-Text Matching
Abstract
Image-text matching has made great progress recently, but challenges remain in fine-grained matching. To address this problem, we propose an Improved Visual Semantic Reasoning model (VSR++), which jointly models 1) global alignment between images and texts and 2) local correspondence between regions and words in a unified framework. To exploit their complementary advantages, we also develop a suitable learning strategy to balance their relative importance. As a result, our model can distinguish image regions and text words at a fine-grained level, and thus achieves the current state-of-the-art performance on two benchmark datasets.
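The idea of combining a global image-text score with a local region-word score can be illustrated with a minimal sketch. This is not the VSR++ implementation: the mean pooling, the max-over-regions matching, the function names, and the balancing weight `alpha` are all assumptions chosen for illustration, and the abstract does not specify any of them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors into a single vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def global_similarity(regions, words):
    # Assumed global branch: compare one pooled image vector with one pooled text vector.
    return cosine(mean_pool(regions), mean_pool(words))


def local_similarity(regions, words):
    # Assumed local branch: each word is matched to its most similar region,
    # and the per-word scores are averaged.
    return sum(max(cosine(w, r) for r in regions) for w in words) / len(words)


def combined_similarity(regions, words, alpha=0.5):
    # alpha is a hypothetical weight balancing the two branches;
    # the paper's learning strategy for this balance is not described in the abstract.
    g = global_similarity(regions, words)
    l = local_similarity(regions, words)
    return alpha * g + (1.0 - alpha) * l


# Toy example: two region features and two word features in a shared 2-D space.
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[1.0, 0.0], [0.0, 1.0]]
score = combined_similarity(regions, words, alpha=0.5)
```

In this toy setup, setting `alpha` to 1.0 uses only the pooled global score, while 0.0 uses only the region-word score, which makes the trade-off between the two branches easy to inspect.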
Year: 2020
DOI: 10.1109/ICPR48806.2021.9413223
Venue: 2020 25th International Conference on Pattern Recognition (ICPR)
DocType: Conference
ISSN: 1051-4651
Citations: 0
PageRank: 0.34
References: 0
Authors: 6
Name | Order | Citations | PageRank
Hui Yuan | 1 | 0 | 0.34
Yan Huang | 2 | 226 | 27.65
Dongbo Zhang | 3 | 143 | 19.22
Zerui Chen | 4 | 0 | 2.37
Wenlong Cheng | 5 | 0 | 0.34
Liang Wang | 6 | 128 | 12.87