Title
A Multiple Positives Enhanced NCE Loss for Image-Text Retrieval
Abstract
Image-Text Retrieval (ITR) enables users to retrieve relevant content across modalities and has attracted considerable attention. Existing approaches typically apply contrastive loss functions in a common embedding space, pulling semantically related pairs closer while pushing unrelated pairs apart. However, we argue that this behaviour is too strict: these approaches neglect the inherent misalignments among potentially semantically related samples. For example, a batch commonly contains more than one positive sample for a given query, and previous methods push these samples apart even though they are semantically related, which leads to a sub-optimal and contradictory optimization direction and in turn degrades retrieval performance. In this paper, a Multiple Positives Enhanced Noise Contrastive Estimation learning objective is proposed to alleviate the diversion noise by leveraging and jointly optimizing multiple positive pairs for each sample in a mini-batch. We demonstrate the effectiveness of our approach on the MS-COCO and Flickr30K datasets for image-to-text and text-to-image retrieval.
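The idea of treating several in-batch candidates as positives for one query can be illustrated with a minimal sketch. This is a generic multi-positive NCE variant (the positives are summed in the numerator of the softmax ratio), not necessarily the objective defined in the paper. The function name multi_positive_nce, the pos_mask argument, and the temperature value 0.07 are assumptions made for illustration only.

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def multi_positive_nce(sim, pos_mask, temperature=0.07):
    """Illustrative multi-positive NCE loss (hypothetical helper).

    sim[i][j]      : similarity between query i and candidate j
    pos_mask[i][j] : True if candidate j counts as a positive for query i
    Returns the mean over queries of
        -log( sum_{p in positives} exp(s_ip / t) / sum_j exp(s_ij / t) ).
    """
    losses = []
    for scores, mask in zip(sim, pos_mask):
        if not any(mask):
            raise ValueError("each query needs at least one positive")
        logits = [s / temperature for s in scores]
        pos_logits = [l for l, is_pos in zip(logits, mask) if is_pos]
        # log(all) - log(positives) equals the negative log ratio above.
        losses.append(_logsumexp(logits) - _logsumexp(pos_logits))
    return sum(losses) / len(losses)


# One query, three candidates; the second candidate is also relevant.
sim = [[0.9, 0.8, 0.1]]
single = multi_positive_nce(sim, [[True, False, False]])
multi = multi_positive_nce(sim, [[True, True, False]])
```

With a single positive, the second candidate acts as a hard negative and inflates the loss. Marking it as a positive removes that contradictory gradient, so multi is smaller than single in this example.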
Year: 2022
DOI: 10.1007/978-3-030-98358-1_34
Venue: MULTIMEDIA MODELING (MMM 2022), PT I
Keywords: Image-text retrieval, Contrastive learning, Noise contrastive estimation
DocType: Conference
Volume: 13141
ISSN: 0302-9743
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name            Order  Citations  PageRank
Yi Li           1      50         13.29
Dehao Wu        2      0          0.34
Zhu Yuesheng    3      112        39.21