Abstract
---
Image captioning is gaining significance in multiple applications such as content-based visual search and chatbots. Much of the recent progress in this field embraces a data-driven approach without deep consideration of human behavioural characteristics. In this paper, we focus on human-centered automatic image captioning. Our study is based on the intuition that different people will generate a variety of image captions for the same scene, as their knowledge of and opinions about the scene may differ. In particular, we first perform a series of human studies to investigate what influences human descriptions of a visual scene. We identify three main factors: a person's knowledge level of the scene, opinion on the scene, and gender. Based on our findings, we propose a novel human-centered algorithm that generates human-like image captions. We evaluate the proposed model with traditional evaluation metrics, diversity metrics, and human-based evaluation. Experimental results demonstrate the superiority of our model in generating diverse, human-like image captions.
Year | DOI | Venue
---|---|---
2020 | 10.1145/3394171.3413589 | MM '20: The 28th ACM International Conference on Multimedia, Seattle, WA, USA, October 2020
DocType | ISBN | Citations | PageRank | References | Authors
---|---|---|---|---|---
Conference | 978-1-4503-7988-5 | 0 | 0.34 | 0 | 5
Name | Order | Citations | PageRank
---|---|---|---
Shuang Wu | 1 | 0 | 0.34
Shaojing Fan | 2 | 22 | 5.63
Zhiqi Shen | 3 | 1148 | 82.57
Mohan Kankanhalli | 4 | 3825 | 299.56
Anthony K. H. Tung | 5 | 3263 | 189.90