Title |
---|
Image-Text Retrieval via Contrastive Learning with Auxiliary Generative Features and Support-set Regularization |
Abstract |
---|
In this paper, we bridge the heterogeneity gap between modalities and improve image-text retrieval by exploiting auxiliary image-to-text and text-to-image generative features with contrastive learning. Concretely, contrastive learning pulls aligned image-text pairs closer together and pushes unaligned pairs apart from both inter- and intra-modality perspectives, with the help of cross-modal retrieval features and auxiliary generative features. In addition, we devise a support-set regularization term that further improves contrastive learning by constraining the distance between each image/text and the cross-modal support-set information contained in the same semantic category. To evaluate the effectiveness of the proposed method, we conduct experiments on three benchmark datasets (i.e., MIRFLICKR-25K, NUS-WIDE, and MS COCO). Experimental results show that our model significantly outperforms strong baselines for cross-modal image-text retrieval. For reproducibility, we release the code and data publicly at: https://github.com/Hambaobao/CRCGS |
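The abstract's core mechanism, pulling aligned image-text pairs together while pushing unaligned pairs apart, is the standard symmetric contrastive (InfoNCE-style) objective. The paper's exact loss (including the generative features and support-set term) is not given here, so the sketch below is only a minimal illustration of that standard objective, assuming batch-aligned image and text embeddings where matching row indices form the positive pairs; all names and the temperature value are illustrative.

```python
import numpy as np

def symmetric_contrastive_loss(img_feats, txt_feats, temperature=0.07):
    """InfoNCE-style sketch: aligned image-text pairs (same row index)
    are pulled together; all other in-batch pairs are pushed apart.
    Illustrative only -- not the paper's exact objective."""
    # L2-normalize so dot products become cosine similarities
    img = img_feats / np.linalg.norm(img_feats, axis=1, keepdims=True)
    txt = txt_feats / np.linalg.norm(txt_feats, axis=1, keepdims=True)
    logits = img @ txt.T / temperature      # (B, B) similarity matrix
    labels = np.arange(logits.shape[0])     # diagonal entries are positives

    def cross_entropy(l, y):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # Symmetric: image-to-text and text-to-image retrieval directions
    return 0.5 * (cross_entropy(logits, labels)
                  + cross_entropy(logits.T, labels))
```

When the two modalities' embeddings agree on the aligned pairs, the diagonal dominates the similarity matrix and the loss approaches zero; mismatched batches yield a loss near the chance level of log(batch size).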
Year | DOI | Venue |
---|---|---|
2022 | 10.1145/3477495.3531783 | SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval |
Keywords | DocType | Citations |
---|---|---|
Cross-modal image-text retrieval, Contrastive learning, Support-set regularization, Generative features | Conference | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 6 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Lei Zhang | 1 | 0 | 0.34 |
Min Yang | 2 | 77 | 20.41 |
Chengming Li | 3 | 0 | 1.35 |
Ruifeng Xu | 4 | 432 | 53.04 |