Title
Sports Video Captioning by Attentive Motion Representation based Hierarchical Recurrent Neural Networks.
Abstract
Sports video captioning is the task of automatically generating a textual description of sports events (e.g. football, basketball or volleyball games). Although previous works have shown promising performance in producing coarse, general descriptions of a video, it remains challenging to caption a sports video that contains multiple fine-grained player actions and complex group relationships among players. In this paper, we present a novel hierarchical recurrent neural network (RNN) based framework with an attention mechanism for sports video captioning. A motion representation module is proposed to extract individual pose attributes and group-level trajectory cluster information. Moreover, we introduce a new dataset called Sports Video Captioning Dataset-Volleyball for evaluation. We evaluate our proposed model on two public datasets and our new dataset, and the experimental results demonstrate that our method outperforms the state-of-the-art methods.
Year: 2018
DOI: 10.1145/3265845.3265851
Venue: MM '18: ACM Multimedia Conference, Seoul, Republic of Korea, October 2018
DocType: Conference
ISBN: 978-1-4503-5981-8
Citations: 3
PageRank: 0.38
References: 16
Authors: 4
Name | Order | Citations | PageRank
Mengshi Qi | 1 | 36 | 3.91
Yunhong Wang | 2 | 3816 | 278.50
Annan Li | 3 | 222 | 14.22
Jiebo Luo | 4 | 6314 | 374.00