Abstract |
---|
This paper studies the problem of image-text matching, with the goal of aligning images and text more accurately. Existing cross-media retrieval methods exploit only part of the available information: they either match the whole image with the whole sentence, or match individual image regions with individual words. To better reveal the latent connections between image and text semantics, this paper proposes a cross-media image-text retrieval method that fuses two levels of similarity. It constructs a two-level cross-media network containing two subnetworks, one handling global features and the other handling local features. Specifically, the image is decomposed into the whole picture and a set of image regions, and the text is decomposed into the whole sentence and its words; each level is modeled separately to explore the full latent alignment between images and text. The two-level alignment framework then allows the levels to reinforce each other, and fusing the two kinds of similarity yields a more complete representation for cross-media retrieval. Experimental evaluation on the Flickr30K and MS-COCO datasets shows that the proposed method matches image and text semantics more accurately and outperforms popular cross-media retrieval methods on all evaluation metrics. |
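The fusion of global (whole image vs. whole sentence) and local (regions vs. words) similarity described in the abstract can be sketched as below. This is a minimal illustration, not the paper's actual formulation: the cosine-similarity scoring, the max-over-regions aggregation for each word, and the fusion weight `alpha` are all illustrative assumptions.

```python
import numpy as np

def cosine(a, b):
    # cosine similarity between two feature vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def global_similarity(img_vec, txt_vec):
    # global level: whole-image feature vs. whole-sentence feature
    return cosine(img_vec, txt_vec)

def local_similarity(region_vecs, word_vecs):
    # local level: score each word by its best-matching image region,
    # then average the per-word scores (an illustrative aggregation)
    scores = [max(cosine(r, w) for r in region_vecs) for w in word_vecs]
    return sum(scores) / len(scores)

def fused_similarity(img_vec, txt_vec, region_vecs, word_vecs, alpha=0.5):
    # weighted fusion of the two similarity levels; alpha is a hypothetical
    # balancing hyperparameter, not a value from the paper
    return (alpha * global_similarity(img_vec, txt_vec)
            + (1 - alpha) * local_similarity(region_vecs, word_vecs))
```

In practice the paper's two subnetworks would produce these feature vectors (e.g., a CNN for image features and a self-attention network for text), and the fused score would rank candidates at retrieval time.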
Year | DOI | Venue
---|---|---
2020 | 10.1109/ACCESS.2020.2969808 | IEEE ACCESS

Keywords | DocType | Volume
---|---|---
Convolutional neural network, self-attention network, attention mechanism, two-level network, cross-media retrieval | Journal | 8

ISSN | Citations | PageRank
---|---|---
2169-3536 | 1 | 0.40

References | Authors
---|---
0 | 4

Name | Order | Citations | PageRank
---|---|---|---
Zhixin Li | 1 | 111 | 24.43 |
Feng Ling | 2 | 1 | 1.41 |
Canlong Zhang | 3 | 3 | 1.11 |
Huifang Ma | 4 | 290 | 29.69 |