Title
From Coarse to Fine: Hierarchical Structure-aware Video Summarization
Abstract
Hierarchical structure is a common characteristic of some kinds of videos (e.g., sports videos and game videos): such videos are composed of several actions arranged hierarchically, the action labels can be enumerated, and temporal dependencies exist among segments at different scales. Our approach is based on two observations. First, actions are the fundamental units by which people understand these videos. Second, humans summarize a video by iteratively observing and refining, i.e., observing segments of the video and hierarchically refining the boundaries of important actions. Based on these insights, we generate action proposals to construct the structure of the video and formulate summarization as a hierarchical refining process. We train a hierarchical summarization network with deep Q-learning (HQSN) to carry out the refining process and to explore temporal dependencies. In addition, we collect a new dataset of structured game videos with fine-grained action and importance annotations. Experimental results demonstrate the effectiveness of the proposed method.
Year
2022
DOI
10.1145/3485472
Venue
ACM Transactions on Multimedia Computing, Communications, and Applications
Keywords
Reinforcement learning, video understanding
DocType
Journal
Volume
18
Issue
1s
ISSN
1551-6857
Citations
0
PageRank
0.34
References
0
Authors
5
Name          Order  Citations  PageRank
Wenxu Li      1      0          0.34
Gang Pan      2      1501       123.57
Chen Wang     3      3          9.53
Zhen Xing     4      0          0.34
Zhenjun Han   5      176        16.40