Title
The Role of Attention Mechanism and Multi-Feature in Image Captioning
Abstract
Caption generation, in which a textual description must be generated for a given image, remains a hard problem in artificial intelligence. The problem combines computer vision and natural language processing. The CNN-RNN architecture is a popular choice for image captioning, and among its many variants the attention mechanism is an important development. Recently, deep learning methods have achieved state-of-the-art results on this problem. In this paper, we present a model that generates natural language descriptions of given images. Our approach uses pre-trained deep neural network models to extract visual features and then applies an LSTM to generate captions. We use BLEU scores to evaluate model performance on the Flickr8k and Flickr30k datasets. In addition, we compare approaches without an attention mechanism against attention-based approaches.
Year
2019
DOI
10.1145/3310986.3311002
Venue
Proceedings of the 3rd International Conference on Machine Learning and Soft Computing
Keywords
CNN, Image captioning, LSTM, RNN, attention mechanism
DocType
Conference
ISBN
978-1-4503-6612-0
Citations
0
PageRank
0.34
References
0
Authors
4
Name | Order | Citations | PageRank
Tien X. Dang | 1 | 0 | 0.34
Aran Oh | 2 | 0 | 0.68
In Seop Na | 3 | 42 | 13.83
Soo Hyung Kim | 4 | 29 | 6.39