Abstract | ||
---|---|---|
Attention mechanisms have attracted considerable interest in image captioning because of their powerful performance. Existing attention-based models use feedback information from the caption generator as guidance to determine which of the image features should be attended to. A common defect of these attention generation methods is that they lack higher-level guiding information from the image itself, which limits their ability to select the most informative image features. Therefore, in this paper, we propose a novel attention mechanism, called topic-guided attention, which integrates image topics into the attention model as guiding information to help select the most important image features. Moreover, we extract image features and image topics with separate networks, which can be fine-tuned jointly in an end-to-end manner during training. The experimental results on the benchmark Microsoft COCO dataset show that our method yields state-of-the-art performance on various quantitative metrics. |
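
The idea in the abstract, attention scores that depend on an image-level topic signal as well as the caption generator's feedback, can be illustrated with a minimal sketch. The abstract does not give the paper's equations, so everything below is an assumption: the additive (tanh) scoring form, the names `topic_guided_attention`, `W_v`, `W_h`, `W_t`, and `w`, the dimensions, and the random stand-ins for learned weights and extracted features. Plain Python lists are used only to keep the example dependency-free.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def topic_guided_attention(features, hidden, topic, W_v, W_h, W_t, w):
    """Additive attention over region features (assumed form, not the paper's).

    Each region score depends on the region feature, the decoder hidden
    state (caption-generator feedback), and an image-level topic vector.
    """
    # The decoder-state and topic terms are shared by every region.
    shared = [a + b for a, b in zip(matvec(W_h, hidden), matvec(W_t, topic))]
    scores = []
    for v in features:
        pre = [p + s for p, s in zip(matvec(W_v, v), shared)]
        scores.append(sum(wi * math.tanh(x) for wi, x in zip(w, pre)))
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the region features.
    dim = len(features[0])
    context = [sum(a * v[j] for a, v in zip(weights, features)) for j in range(dim)]
    return weights, context


def random_matrix(rows, cols, rng):
    """Random stand-in for a learned weight matrix or extracted features."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_regions, feat_dim, hid_dim, topic_dim, att_dim = 4, 6, 5, 3, 8  # hypothetical sizes
features = random_matrix(num_regions, feat_dim, rng)
hidden = [rng.uniform(-1, 1) for _ in range(hid_dim)]
topic = softmax([rng.uniform(-1, 1) for _ in range(topic_dim)])  # topic distribution
W_v = random_matrix(att_dim, feat_dim, rng)
W_h = random_matrix(att_dim, hid_dim, rng)
W_t = random_matrix(att_dim, topic_dim, rng)
w = [rng.uniform(-0.1, 0.1) for _ in range(att_dim)]

weights, context = topic_guided_attention(features, hidden, topic, W_v, W_h, W_t, w)
```

Here `weights` holds one non-negative value per image region, summing to 1, and `context` is the resulting weighted feature vector that a caption decoder would consume at one time step. In a trained model the topic vector and region features would come from the separate networks the abstract mentions rather than from random numbers.
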
Year | Venue | Keywords |
---|---|---|
2018 | 2018 25th IEEE International Conference on Image Processing (ICIP) | Image captioning, Attention, Topic, Attribute, Deep Neural Network |
DocType | Volume | ISSN |
Conference | abs/1807.03514 | 1522-4880 |
Citations | PageRank | References |
0 | 0.34 | 9 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zhihao Zhu | 1 | 3 | 5.13 |
Zhan Xue | 2 | 0 | 0.34 |
Zejian Yuan | 3 | 614 | 37.37 |