Video Captioning Using Global-Local Representation - Citegraph

Paper Info

Title
Video Captioning Using Global-Local Representation

Abstract
Video captioning is a challenging task because it requires accurately transforming visual understanding into natural language descriptions. To date, state-of-the-art methods inadequately model global-local vision representation for sentence generation, leaving plenty of room for improvement. In this work, we approach the video captioning task from a new perspective and propose GLR, a global-local representation framework. GLR demonstrates three advantages over prior efforts. First, we propose a simple solution that exploits extensive vision representations from different video ranges to improve linguistic expression. Second, we devise a novel global-local encoder, which encodes different video representations, including long-range, short-range, and local-keyframe representations, to produce a rich semantic vocabulary for obtaining a descriptive granularity of video contents across frames. Finally, we introduce a progressive training strategy that effectively organizes feature learning to elicit optimal captioning behavior. Evaluated on the MSR-VTT and MSVD datasets, GLR outperforms recent state-of-the-art methods, including a well-tuned SA-LSTM baseline, by a significant margin, with shorter training schedules. Because of its simplicity and efficacy, we hope that GLR can serve as a strong baseline for many video understanding tasks besides video captioning. Code will be available.

Year	2022
DOI	10.1109/TCSVT.2022.3177320
Venue	IEEE Transactions on Circuits and Systems for Video Technology
Keywords	Computer vision, video captioning, video representation, natural language processing, visual analysis
DocType	Journal
Volume	32
Issue	10
ISSN	1051-8215
Citations	0
PageRank	0.34
References	24
Authors	7

Authors (7 rows)

Name	Order	Citations	PageRank
Liqi Yan	1	0	1.01
Siqi Ma	2	0	0.34
Qifan Wang	3	0	0.34
Victor Yingjie Chen	4	52	27.37
Xiangyu Zhang	5	2857	151.00
Andreas Savakis	6	377	41.10
Dongfang Liu	7	0	1.69
