Multi-decoder Based Co-attention for Image Captioning. - Citegraph

Paper Info

Title
Multi-decoder Based Co-attention for Image Captioning.

Abstract
Image captioning has recently gained increasing attention in artificial intelligence. Existing image captioning models typically apply a visual mechanism only once to capture the related region maps, which makes it difficult to attend effectively to the regions relevant to each generated word. In this paper, we propose a novel multi-decoder based co-attention framework for image captioning, composed of multiple decoders that integrate a detection-based mechanism and a free-form region based attention mechanism. The proposed approach produces more precise captions by co-attending to the free-form regions and the detections. In particular, because "Teacher-Forcing" leads to a mismatch between training and testing, as well as exposure bias, we use a reinforcement learning approach to optimize the model. The proposed method is evaluated on the benchmark MSCOCO dataset and achieves state-of-the-art performance.

Year	2018
DOI	10.1007/978-3-030-00767-6_19
Venue	ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2018, PT II
Keywords	Co-attention, Image captioning, Multi-decoder
Field	Computer vision, Closed captioning, Computer science, Artificial intelligence, Machine learning, Reinforcement learning
DocType	Conference
Volume	11165
ISSN	0302-9743
Citations	0
PageRank	0.34
References	13
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Zhen Sun	1	9	4.89
Xin Lin	2	66	18.39
Zhaohui Wang	3	7	7.58
Yi Ji	4	80	13.06
Chunping Liu	5	27	10.56
