Title
Guiding Long-Short Term Memory for Image Caption Generation
Abstract
In this work we focus on the problem of image caption generation. We propose an extension of the long short-term memory (LSTM) model, which we call gLSTM for short. In particular, we add semantic information extracted from the image as an extra input to each unit of the LSTM block, with the aim of guiding the model towards solutions that are more tightly coupled to the image content. Additionally, we explore different length normalization strategies for beam search in order to avoid favoring short sentences. On various benchmark datasets such as Flickr8K, Flickr30K and MS COCO, we obtain results that are on par with, or even outperform, the current state of the art.
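The length bias mentioned in the abstract can be illustrated with a minimal sketch. It assumes that beam search ranks a candidate by the sum of its token log-probabilities, and it uses one simple normalization (dividing by the length raised to a power `alpha`). The function names, the `alpha` parameter and the candidate probabilities are hypothetical and chosen for illustration only; this is not necessarily one of the strategies evaluated in the paper.

```python
import math


def normalized_score(log_probs, alpha=1.0):
    """Sum of token log-probabilities divided by length ** alpha.

    alpha = 0 gives the raw sum, which tends to favor short sentences;
    alpha = 1 gives the per-token average.
    (Illustrative assumption, not the paper's exact formulation.)
    """
    if not log_probs:
        raise ValueError("candidate must contain at least one token")
    return sum(log_probs) / (len(log_probs) ** alpha)


def best(candidates, alpha):
    """Return the candidate with the highest normalized score."""
    return max(candidates, key=lambda c: normalized_score(c, alpha))


# Made-up per-token log-probabilities for two hypothetical captions.
short = [math.log(0.5)] * 2  # 2 tokens, each with probability 0.5
long = [math.log(0.6)] * 4   # 4 tokens, each with probability 0.6

# Without normalization the shorter caption wins, because every extra
# token adds another negative term to the sum.
raw_winner = best([short, long], alpha=0.0)

# With per-token averaging the longer caption wins, because its tokens
# are individually more likely.
norm_winner = best([short, long], alpha=1.0)
```

In this toy case the raw sum prefers the two-token caption even though each of its tokens is less likely than those of the four-token caption; averaging per token reverses the ranking.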
Year: 2015
Venue: CoRR
Field: Normalization (statistics), Pattern recognition, Computer science, Image content, Long short term memory, Beam search, Semantic information, Artificial intelligence
DocType: Journal
Volume: abs/1509.04942
Citations: 11
PageRank: 0.61
References: 23
Authors: 4
Name               Order  Citations  PageRank
Xu Jia             1      333        20.97
Efstratios Gavves  2      655        33.41
Basura Fernando    3      775        35.60
Tinne Tuytelaars   4      10161      609.66