Abstract |
---|
Recently, there has been a surge of interest in image-text multimodal representation learning, and many neural-network-based models have been proposed to capture the interaction between the two modalities with different forms of interaction functions. Despite their success, a potential limitation of these methods is that a single set of static parameters is insufficient to model all kinds of interactions. To alleviate this problem, we present a dynamic interaction network, in which the parameters of the interaction function are dynamically generated by a meta network. Additionally, to provide the multimodal features that the meta network needs, we propose a new neural module called the Multimodal Transformer. Experimentally, we not only conduct a comprehensive quantitative evaluation on four image-text tasks, but also present interpretable analyses of our models, revealing the internal working mechanism of dynamic parameter learning. |
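The abstract names the core mechanism (a meta network generating the parameters of the interaction function) but this metadata page contains no architectural details. The PyTorch sketch below illustrates the general hypernetwork-style idea under stated assumptions: the class name `DynamicInteraction`, the hidden size, the feature dimensions, and the choice of a per-example linear interaction are all hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn

class DynamicInteraction(nn.Module):
    """Minimal sketch of a dynamic interaction layer: a meta network predicts
    the weights of the interaction function from the multimodal inputs, rather
    than relying on a single static weight matrix. Dimensions are illustrative."""

    def __init__(self, img_dim: int, txt_dim: int, out_dim: int):
        super().__init__()
        self.out_dim = out_dim
        # Meta network: maps the fused input to a flat parameter vector,
        # which is reshaped into the weights of a per-example linear map.
        self.meta = nn.Sequential(
            nn.Linear(img_dim + txt_dim, 128),
            nn.ReLU(),
            nn.Linear(128, (img_dim + txt_dim) * out_dim),
        )

    def forward(self, img_feat: torch.Tensor, txt_feat: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([img_feat, txt_feat], dim=-1)                # (B, img+txt)
        # Dynamically generated weights: one weight matrix per example.
        w = self.meta(fused).view(-1, self.out_dim, fused.size(-1))    # (B, out, img+txt)
        # Apply the per-example interaction function: y_b = W_b x_b.
        return torch.bmm(w, fused.unsqueeze(-1)).squeeze(-1)           # (B, out)

# Usage with a batch of 8 image/text feature pairs (hypothetical dims).
img = torch.randn(8, 512)   # e.g., visual features from an image encoder
txt = torch.randn(8, 300)   # e.g., textual features from a text encoder
layer = DynamicInteraction(img_dim=512, txt_dim=300, out_dim=256)
print(layer(img, txt).shape)  # torch.Size([8, 256])
```

The distinguishing property, per the abstract, is that the interaction weights `w` are a function of the input pair rather than fixed learned parameters, so each image-text pair is processed by its own interaction function.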
Year | DOI | Venue
---|---|---
2020 | 10.1016/j.neucom.2019.10.103 | Neurocomputing

Keywords | DocType | Volume
---|---|---
Multimodal learning, Dynamic parameters prediction, Deep neural networks | Journal | 379

ISSN | Citations | PageRank
---|---|---
0925-2312 | 1 | 0.36

References | Authors
---|---
0 | 4
Name | Order | Citations | PageRank |
---|---|---|---
Wenshan Wang | 1 | 24 | 9.00 |
Pengfei Liu | 2 | 58 | 7.83 |
Su Yang | 3 | 110 | 14.58 |
Weishan Zhang | 4 | 396 | 52.57 |