Abstract |
---|
Image captioning is a difficult problem that aims to automatically describe the content of an image with appropriate textual descriptions. Chinese captioning in particular remains challenging because of the language's complex semantics and varied expressions. In this paper, we extend research on automated image captioning along the language dimension and propose a novel Chinese image captioning model that uses a double-layer LSTM with an attention mechanism to generate more natural Chinese sentences. In our model, the Inception-v3 network extracts image features, and the attention mechanism, built on the double-layer LSTM, weights these features to predict each word. Experimental results on the AIC-ICC dataset demonstrate that our proposed model generates better Chinese captions, which are more accurate and fluent. Compared with traditional Chinese image captioning algorithms, our method greatly improves captioning performance, achieving BLEU-4 and CIDEr scores of 40.2 and 119.9, respectively. Generated examples also show that the model produces accurate, diverse, and vivid Chinese captions of images. |
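The abstract describes an attention mechanism that weights extracted image features before each word is predicted. As a rough, stdlib-only sketch of that soft-attention step (not the authors' implementation; the scoring function `score_fn`, which in the paper would condition on the double-layer LSTM's hidden state, is a placeholder assumption here):

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scalar scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, score_fn):
    """Soft attention: form a context vector as a weighted sum of region features.

    features: list of region feature vectors (e.g. spatial features from Inception-v3)
    score_fn: maps one feature vector to a scalar relevance score; in the paper this
              would depend on the decoder LSTM's hidden state (assumption here).
    Returns (context vector, attention weights).
    """
    alphas = softmax([score_fn(f) for f in features])
    dim = len(features[0])
    context = [sum(a * f[i] for a, f in zip(alphas, features)) for i in range(dim)]
    return context, alphas

# Toy usage: two "regions"; the scorer prefers the first one.
context, alphas = attend([[1.0, 0.0], [0.0, 1.0]], score_fn=lambda f: f[0])
```

The weights `alphas` always sum to 1, so the context vector stays in the convex hull of the region features; the decoder then consumes this context at each time step.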
Year | DOI | Venue |
---|---|---|
2021 | 10.1109/IJCNN52387.2021.9533463 | 2021 International Joint Conference on Neural Networks (IJCNN) |
Keywords | DocType | ISSN
---|---|---|
Chinese image captioning, attention model, LSTM | Conference | 2161-4393 |
Citations | PageRank | References
---|---|---|
0 | 0.34 | 0 |
Authors (2)
Name | Order | Citations | PageRank |
---|---|---|---|
Wu Wei | 1 | 204 | 14.84 |
Deshuai Sun | 2 | 0 | 0.34 |