Title
An Empirical Study on Ensemble Learning of Multimodal Machine Translation
Abstract
With the increasing availability of images, multimodal machine translation (MMT) has become a vibrant research field. Model structure and the introduction of multimodal information are the current focal points for MMT researchers. Among existing models, the Transformer has achieved state-of-the-art performance on many translation tasks. However, we observe that the performance of Transformer-based MMT is highly unstable, since the Transformer is sensitive to fluctuations in hyper-parameters, especially the number of layers, the dimensions of word embeddings and hidden states, and the number of attention heads. Moreover, different ways of introducing image information also significantly influence MMT performance. In this paper, we exploit task-dependent integration strategies that make collaborative decisions on the final translation results, in order to enhance the stability of Transformer-based MMT. Furthermore, we combine different ways of introducing image information to improve the semantic expression of the input. Extensive experiments on the Multi30K dataset demonstrate that ensemble learning in MMT, integrating text and image features, obtains more stable and better translation performance; the best result yields an improvement of 5.12 BLEU points over the strong Transformer baseline used in our experiments.
Year
2020
DOI
10.1109/BigMM50055.2020.00019
Venue
2020 IEEE Sixth International Conference on Multimedia Big Data (BigMM)
Keywords
Multimodal machine translation, Transformer, Ensemble learning, Synonym-replacing, Deep learning
DocType
Conference
ISBN
978-1-7281-9326-7
Citations
0
PageRank
0.34
References
9
Authors
7
Name        | Order | Citations | PageRank
Liang Tan   | 1     | 0         | 0.34
Lin Li      | 2     | 225       | 31.97
Yifeng Han  | 3     | 0         | 0.34
Dong Li     | 4     | 475       | 67.20
Kaixi Hu    | 5     | 1         | 1.03
Dong Zhou   | 6     | 27        | 8.01
Peipei Wang | 7     | 2         | 1.75