Title
Variational multimodal machine translation with underlying semantic alignment
Abstract
Capturing the underlying semantic relationships of sentences is helpful for machine translation. Variational neural machine translation approaches provide an effective way to model the uncertain underlying semantics of languages by introducing latent variables. Multitask learning is applied in multimodal machine translation to integrate multimodal data. However, these approaches usually lack a principled interpretation of how out-of-text information is utilized in machine translation tasks. In this paper, we propose a novel architecture-free multimodal translation model, called variational multimodal machine translation (VMMT), under the variational framework, which can model the uncertainty in languages caused by ambiguity by utilizing both visual and textual information. In addition, the proposed model eliminates the discrepancy between training and prediction in existing variational translation models by constructing encoders that rely only on source data. More importantly, the proposed multimodal translation model is designed as multitask learning, in which a shared semantic representation across modalities is learned and the gap among semantic representations from different modalities is reduced by incorporating additional constraints. Moreover, the information bottleneck theory is adopted in our variational encoder–decoder model, which helps the encoder filter redundancy and the decoder concentrate on useful information. Experiments on multimodal machine translation demonstrate that the proposed model is competitive.
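The abstract's combination of a variational latent variable, a source-only encoder, and an information-bottleneck weight can be summarized by a standard evidence-lower-bound objective. The following is a generic sketch of that objective, not an equation taken from the paper; the symbols (x source sentence, v image, y target sentence, z latent semantic variable, β bottleneck weight, λ alignment weight) are illustrative assumptions:

```latex
% Variational translation objective with a source-only posterior q_\phi(z \mid x, v),
% a KL term weighted by \beta (information bottleneck), and an additional constraint
% D(\cdot,\cdot) aligning the text-only and multimodal latent representations.
\mathcal{L}(\theta,\phi)
  = \mathbb{E}_{q_\phi(z \mid x, v)}\!\left[\log p_\theta(y \mid x, z)\right]
  - \beta\, \mathrm{KL}\!\left(q_\phi(z \mid x, v)\,\|\,p(z)\right)
  - \lambda\, D\!\left(q_\phi(z \mid x),\, q_\phi(z \mid x, v)\right)
```

Because the posterior conditions only on the source sentence and image (never on the target y), the same encoder can be used at training and prediction time, which is the train/test consistency the abstract refers to.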
Year
2021
DOI
10.1016/j.inffus.2020.11.011
Venue
Information Fusion
Keywords
Machine translation, Variational neural machine translation, Multimodal learning
DocType
Journal
Volume
69
ISSN
1566-2535
Citations
1
PageRank
0.36
References
0
Authors
5
Name          Order  Citations  PageRank
Ying Liu      1      364        46.92
Jing Zhao     2      1          0.36
Shiliang Sun  3      1          0.36
Huawen Liu    4      1          0.36
Hao Yang      5      2          2.74