Title | ||
---|---|---|
Sports Video Captioning via Attentive Motion Representation and Group Relationship Modeling |
Abstract | ||
---|---|---|
Sports video captioning is the task of automatically generating a textual description of sports events (football, basketball, or volleyball games). Although much previous work has shown promising performance in producing coarse, general descriptions of a video, such descriptions lack professional sports knowledge, and it remains quite challenging to caption a sports video that contains multiple fine-grained player actions and complex group relationships between players. In this paper, we present a novel hierarchical recurrent neural network-based framework with an attention mechanism for sports video captioning, in which a motion representation module captures individual pose attributes and dynamic trajectory cluster information with extra professional sports knowledge, and a group relationship module builds a scene graph that models players’ interactions with a gated graph convolutional network. Moreover, we introduce a new dataset called sports video captioning dataset-volleyball for evaluation. The proposed model is evaluated on three widely adopted public datasets and our newly collected dataset, and the results demonstrate the effectiveness of our method. |
Year | DOI | Venue |
---|---|---|
2020 | 10.1109/TCSVT.2019.2921655 | Periodicals |
Keywords | DocType | Volume |
Sports, Visualization, Trajectory, Semantics, Task analysis, Logic gates, Games, Sports video, video captioning, motion representation, group relationship, RNN | Journal | 30 |
Issue | ISSN | Citations |
8 | 1051-8215 | 6 |
PageRank | References | Authors |
0.70 | 29 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Mengshi Qi | 1 | 36 | 3.91 |
Yunhong Wang | 2 | 3816 | 278.50 |
Annan Li | 3 | 222 | 14.22 |
Jiebo Luo | 4 | 6314 | 374.00 |