Abstract |
---|
In this paper, we propose a novel approach to video text keyframe detection, with the goal of representing a video by its textual keyframes and reducing the resources wasted in video review. Unlike video summarization work, which mainly focuses on scene variation in videos, we attend to the variation of text between sequential frames. To this end, a Text Siamese Network (TSN) is developed to automatically detect the keyframes that contain text in videos. Specifically, the TSN is composed of two branches: text similarity measurement and text identification. The first branch evaluates the similarity between consecutive frames. In the second branch, an attention block selects the informative features used to identify whether a frame contains text. Additionally, a new dataset called VTKD2019 is proposed for video text keyframe detection. VTKD2019 contains 571 videos and is split into three levels (easy, medium, and hard) for evaluation. The experimental results on VTKD2019 and ICDAR2015 demonstrate the effectiveness of our method. |
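
The abstract names two per-frame signals: a similarity score between consecutive frames and a decision on whether a frame contains text. As a rough illustration of how such signals could be combined into keyframe picks, here is a minimal sketch. The decision rule, the function name `select_text_keyframes`, and both thresholds are assumptions made for illustration; the abstract does not specify how the TSN outputs are turned into a final selection, and the scores here are plain numbers standing in for network outputs.

```python
from typing import List, Sequence


def select_text_keyframes(
    text_probs: Sequence[float],
    similarities: Sequence[float],
    text_threshold: float = 0.5,  # assumed cutoff, not taken from the paper
    sim_threshold: float = 0.5,   # assumed cutoff, not taken from the paper
) -> List[int]:
    """Return indices of frames treated as text keyframes.

    text_probs[i]   -- hypothetical score that frame i contains text
    similarities[i] -- hypothetical text similarity between frame i and i + 1
    """
    if len(text_probs) == 0:
        return []
    if len(similarities) != len(text_probs) - 1:
        raise ValueError("expected one similarity per pair of consecutive frames")

    keyframes: List[int] = []
    prev_has_text = False
    for i, prob in enumerate(text_probs):
        has_text = prob >= text_threshold
        if has_text:
            # Assumed rule: a text frame starts a new keyframe when text first
            # appears, or when it differs enough from the previous frame.
            text_changed = (
                i == 0
                or not prev_has_text
                or similarities[i - 1] < sim_threshold
            )
            if text_changed:
                keyframes.append(i)
        prev_has_text = has_text
    return keyframes
```

Under this assumed rule, a run of near-identical text frames yields a single keyframe, and frames without text are never selected.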
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ICDAR.2019.00077 | 2019 International Conference on Document Analysis and Recognition (ICDAR) |
Keywords | Field | DocType |
---|---|---|
video textual keyframe detection, Text Siamese Network (TSN), attention mechanism | Automatic summarization, Pattern recognition, Computer science, Natural language processing, Artificial intelligence | Conference |
ISSN | ISBN | Citations |
---|---|---|
1520-5363 | 978-1-7281-3015-6 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 4 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hao Song | 1 | 0 | 0.34 |
Hongzhen Wang | 2 | 0 | 0.34 |
Shan Huang | 3 | 0 | 0.34 |
Pei Xu | 4 | 0 | 0.34 |
Shen Huang | 5 | 11 | 4.61 |
Qi Ju | 6 | 0 | 0.34 |