Sports Video Captioning via Attentive Motion Representation and Group Relationship Modeling - Citegraph

Paper Info

Title
Sports Video Captioning via Attentive Motion Representation and Group Relationship Modeling

Abstract
Sports video captioning refers to the task of automatically generating a textual description of sports events (football, basketball, or volleyball games). Although a great deal of previous work has shown promising performance in producing coarse, general descriptions of a video, such descriptions lack professional sports knowledge, and it remains challenging to caption a sports video that contains multiple fine-grained player actions and complex group relationships between players. In this paper, we present a novel hierarchical recurrent neural network-based framework with an attention mechanism for sports video captioning. A motion representation module is proposed to capture individual pose attributes and dynamic trajectory cluster information with extra professional sports knowledge, and a group relationship module builds a scene graph that models players’ interactions using a gated graph convolutional network. Moreover, we introduce a new dataset, called the sports video captioning dataset-volleyball, for evaluation. The proposed model is evaluated on three widely adopted public datasets and on our newly collected dataset, and the results demonstrate the effectiveness of our method.

Year	2020
DOI	10.1109/TCSVT.2019.2921655
Venue	Periodicals
Keywords	Sports, Visualization, Trajectory, Semantics, Task analysis, Logic gates, Games, Sports video, video captioning, motion representation, group relationship, RNN
DocType	Journal
Volume	30
Issue	8
ISSN	1051-8215
Citations	6
PageRank	0.70
References	29
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Mengshi Qi	1	36	3.91
Yunhong Wang	2	3816	278.50
Annan Li	3	222	14.22
Jiebo Luo	4	6314	374.00
