Title
Image Caption Generator with Novel Object Injection
Abstract
Image captioning is a rapidly progressing field within artificial intelligence with considerable potential. A major obstacle is the limited amount of available data. The only dataset considered suitable for the task is the Microsoft Common Objects in Context (MSCOCO) dataset, which contains about 120,000 training images covering about 80 object classes. That is too few classes to build robust solutions that are not constrained by the data at hand. To overcome this problem, we propose a solution that incorporates Zero-Shot Learning concepts to identify unknown objects and classes by using semantic word embeddings and existing state-of-the-art object identification algorithms. Our proposed model, Image Captioning using Novel Word Injection, takes the output of a pre-trained caption generator and injects objects that are not present in the dataset into the caption. We evaluate the model on the standard metrics BLEU, CIDEr and ROUGE-L. The results outperform the underlying model both qualitatively and quantitatively.
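The injection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy embedding values, the function name `inject_novel_object`, the similarity threshold, and the replace-the-nearest-word rule are all assumptions made for illustration. It assumes an object detector has already returned labels for the image and that a pre-trained captioner has already produced a caption.

```python
import math

# Toy 3-d word embeddings (hypothetical values, for illustration only).
# A real system would load pre-trained semantic word embeddings instead.
EMBEDDINGS = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.3, 0.0],
    "car": [0.0, 0.1, 0.9],
    "bus": [0.1, 0.0, 0.8],
    "zebra": [0.85, 0.2, 0.05],  # novel: absent from the caption vocabulary
    "tram": [0.05, 0.05, 0.85],  # novel: absent from the caption vocabulary
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def inject_novel_object(caption, detected, known_vocab, threshold=0.8):
    """Swap the caption word most similar to each detected novel object.

    Assumed rule: a known word is replaced only if its embedding is
    closer to the novel object's embedding than `threshold`.
    """
    tokens = caption.split()
    for obj in detected:
        # Skip objects the captioner already knows or that have no embedding.
        if obj in known_vocab or obj not in EMBEDDINGS:
            continue
        best_idx, best_sim = None, threshold
        for i, tok in enumerate(tokens):
            if tok in known_vocab and tok in EMBEDDINGS:
                sim = cosine(EMBEDDINGS[tok], EMBEDDINGS[obj])
                if sim > best_sim:
                    best_idx, best_sim = i, sim
        if best_idx is not None:
            tokens[best_idx] = obj
    return " ".join(tokens)


known = {"dog", "cat", "car", "bus"}
print(inject_novel_object("a dog standing in a field", ["zebra"], known))
```

In this sketch a detected "zebra" replaces "dog" because their toy embeddings are close, while a detected "tram" would leave the same caption unchanged because no caption word is similar enough.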
Year
2018
DOI
10.1109/DICTA.2018.8615810
Venue
2018 Digital Image Computing: Techniques and Applications (DICTA)
Keywords
Image Caption, Microsoft Common Objects in Context (MSCOCO), Convolutional Neural Network, Recurrent Neural Network
Field
Closed captioning, Pattern recognition, Computer science, Convolutional neural network, As is, Recurrent neural network, Artificial intelligence
DocType
Conference
ISBN
978-1-5386-6603-6
Citations
0
PageRank
0.34
References
4
Authors
5