Title
DRSL: Deep Relational Similarity Learning for Cross-modal Retrieval
Abstract
Cross-modal retrieval aims to retrieve relevant samples across different media modalities. Existing cross-modal retrieval approaches rely on learning common representations for all modalities, implicitly assuming that each modality carries an equal amount of information. However, because the quantity of information in cross-modal samples is unbalanced and unequal, it is inappropriate to directly match the obtained modality-specific representations across different modalities in a common space. In this paper, we propose a new method called Deep Relational Similarity Learning (DRSL) for cross-modal retrieval. Unlike existing approaches, the proposed DRSL bridges the heterogeneity gap between modalities by directly learning the natural pairwise similarities instead of explicitly learning a common space. DRSL is a deep hybrid framework that integrates a relation network module for relation learning, capturing an implicit nonlinear distance metric. To the best of our knowledge, DRSL is the first approach that incorporates relation networks into the cross-modal learning scenario. Comprehensive experimental results show that the proposed DRSL model achieves state-of-the-art results on cross-modal retrieval tasks over four widely used benchmark datasets, i.e., Wikipedia, Pascal Sentences, NUS-WIDE-10K, and XMediaNet.
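The abstract describes scoring image-text pairs with a relation network rather than matching modality-specific representations in a common space. The following is a minimal sketch of that idea, assuming PyTorch, separate encoders per modality, and a small relation module that scores concatenated pairs; all module names, layer sizes, and feature dimensions are illustrative assumptions, not the authors' exact DRSL architecture.

```python
# Sketch of a relation-network-style cross-modal similarity scorer.
# Layer sizes and encoders are assumptions for illustration only.
import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    """Maps modality-specific features (e.g., CNN image features or
    text embeddings) to representations of a shared dimensionality."""
    def __init__(self, in_dim, hid_dim=1024, out_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, out_dim), nn.ReLU(),
        )
    def forward(self, x):
        return self.net(x)

class RelationModule(nn.Module):
    """Scores a concatenated cross-modal pair; the learned network acts
    as an implicit nonlinear distance metric between modalities."""
    def __init__(self, feat_dim=512, hid_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * feat_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, 1), nn.Sigmoid(),  # similarity in [0, 1]
        )
    def forward(self, img_feat, txt_feat):
        return self.net(torch.cat([img_feat, txt_feat], dim=-1)).squeeze(-1)

# Usage: score every image against every text in a batch to build the
# similarity matrix used for retrieval ranking.
img_enc = ModalityEncoder(in_dim=4096)   # e.g., CNN fc7 features (assumed)
txt_enc = ModalityEncoder(in_dim=300)    # e.g., averaged word vectors (assumed)
relation = RelationModule()

imgs, txts = torch.randn(8, 4096), torch.randn(8, 300)
fi, ft = img_enc(imgs), txt_enc(txts)                  # (8, 512) each
scores = relation(fi.unsqueeze(1).expand(-1, 8, -1),   # (8, 8) pairwise
                  ft.unsqueeze(0).expand(8, -1, -1))   # similarity scores
```

In such a design, pairwise similarities would be supervised directly (e.g., pushing matched pairs toward 1 and mismatched pairs toward 0), so no explicit common space has to be aligned across modalities.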
Year
2021
DOI
10.1016/j.ins.2020.08.009
Venue
Information Sciences
Keywords
Cross-modal retrieval, Relation network, Relational similarity learning, Heterogeneity gap
DocType
Journal
Volume
546
ISSN
0020-0255
Citations
1
PageRank
0.37
References
0
Authors
4
Name          | Order | Citations | PageRank
Xu Wang       | 1     | 21        | 1.97
Peng Hu       | 2     | 71        | 9.06
Liangli Zhen  | 3     | 72        | 9.73
Dezhong Peng  | 4     | 285       | 27.92