TIGEr: Text-to-Image Grounding for Image Caption Evaluation - Citegraph

Paper Info

Title
TIGEr: Text-to-Image Grounding for Image Caption Evaluation

Abstract
This paper presents a new metric called TIGEr for the automatic evaluation of image captioning systems. Popular metrics, such as BLEU and CIDEr, are based solely on text matching between reference captions and machine-generated captions, potentially leading to biased evaluations because references may not fully cover the image content and natural language is inherently ambiguous. Building upon a machine-learned text-image grounding model, TIGEr makes it possible to evaluate caption quality based not only on how well a caption represents image content, but also on how well machine-generated captions match human-generated captions. Our empirical tests show that TIGEr has a higher consistency with human judgments than alternative existing metrics. We also comprehensively assess the metric's effectiveness in caption evaluation by measuring the correlation between human judgments and metric scores.

Field	Value
Year	2019
DOI	10.18653/v1/D19-1220
Venue	EMNLP/IJCNLP (1)
DocType	Conference
Volume	D19-1
Citations	1
PageRank	0.34
References	0
Authors	8

Authors (8 rows)

Name	Order	Citations	PageRank
Ming Jiang	1	1	0.68
Qiuyuan Huang	2	176	17.66
Lei Zhang	3	1	0.34
Xin Wang	4	1	0.34
Pengchuan Zhang	5	31	8.17
Zhe Gan	6	319	32.58
Jana Diesner	7	216	24.38
Jianfeng Gao	8	5729	296.43
