Abstract |
---|
Image-text matching aims to discover the relationship between image and text data and to establish a connection between them. The main challenge is that images and texts have different data distributions and feature representations. Current methods fall into two basic types: those that map image and text data into a common space and measure distances there, and those that treat image-text matching as a classification problem. In both cases, only the two original modalities, image and text, are used. In our method, we create a fusion layer to extract intermediate modalities, thereby improving image-text matching results. We also propose a concise update to the loss function that makes it easier for the network to handle difficult samples. The proposed method was verified on the Flickr30K and MS-COCO datasets and achieved superior matching results compared with existing methods. |
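The fusion layer described in the abstract can be illustrated with a minimal sketch: concatenate an image embedding and a text embedding, project them through a learned linear map, and L2-normalise the result so it can be compared by cosine similarity. The dimensions, the `tanh` nonlinearity, and the function names below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse(img_feat, txt_feat, W, b):
    """Hypothetical fusion layer: project the concatenated image and
    text features into a shared intermediate representation."""
    x = np.concatenate([img_feat, txt_feat])
    h = np.tanh(W @ x + b)            # nonlinearity chosen for illustration
    return h / np.linalg.norm(h)      # unit-normalise for cosine matching

# Toy dimensions (assumptions, not from the paper)
d_img, d_txt, d_fused = 8, 6, 4
W = rng.standard_normal((d_fused, d_img + d_txt))
b = np.zeros(d_fused)

img = rng.standard_normal(d_img)    # stand-in for a CNN image feature
txt = rng.standard_normal(d_txt)    # stand-in for a text-encoder feature
fused = fuse(img, txt, W, b)        # intermediate-modality vector
```

Because `fused` is unit-length, the dot product between two fused vectors is directly their cosine similarity, which is the usual matching score in common-space retrieval methods.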
Year | DOI | Venue |
---|---|---|
2021 | 10.1016/j.neucom.2021.01.124 | Neurocomputing |
Keywords | DocType | Volume
---|---|---
Deep learning, Image-text matching, Multimodal, Retrieval | Journal | 442

ISSN | Citations | PageRank
---|---|---
0925-2312 | 2 | 0.42

References | Authors
---|---
0 | 8

Name | Order | Citations | PageRank |
---|---|---|---|
Depeng Wang | 1 | 2 | 0.42 |
Liejun Wang | 2 | 7 | 2.86 |
Shiji Song | 3 | 1247 | 94.76 |
Gao Huang | 4 | 875 | 53.36 |
Yuchen Guo | 5 | 710 | 35.96 |
Shuli Cheng | 6 | 6 | 7.59 |
Naixiang Ao | 7 | 2 | 0.42 |
Anyu Du | 8 | 4 | 4.19 |