Title
Text Siamese Network for Video Textual Keyframe Detection
Abstract
In this paper, we propose a novel approach to video text keyframe detection, with the goal of representing a video by its textual keyframes and reducing the resources wasted in video review. Unlike work on video summarization, which mainly focuses on scene variation within videos, we focus on how the text varies between sequential frames. To this end, we develop the Text Siamese Network (TSN) to automatically detect the keyframes that contain text in videos. Specifically, the TSN is composed of two branches: text similarity measurement and text identification. The first branch evaluates the similarity between consecutive frames. In the second branch, an attention block selects informative features to identify whether a frame contains text. Additionally, we introduce a new dataset, VTKD2019, for video text keyframe detection. VTKD2019 contains 571 videos and is split into three levels (easy, medium, and hard) for evaluation. Experimental results on VTKD2019 and ICDAR2015 demonstrate the effectiveness of our method.
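The abstract does not say how the outputs of the two branches are combined into a final keyframe decision. As a minimal sketch, assuming each consecutive frame pair receives a similarity score in [0, 1] and each frame receives a boolean text flag, one plausible selection rule keeps a text frame whenever it starts a new text segment or its text differs enough from the previous frame. The function name `select_text_keyframes`, the `sim_threshold` parameter, and the rule itself are hypothetical illustrations, not the authors' method.

```python
from typing import List, Sequence


def select_text_keyframes(
    similarity: Sequence[float],
    has_text: Sequence[bool],
    sim_threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> List[int]:
    """Return indices of frames kept as textual keyframes.

    similarity[i] is an assumed similarity score in [0, 1] between
    frame i and frame i + 1, so it has one fewer entry than has_text.
    """
    if len(has_text) == 0:
        return []
    if len(similarity) != len(has_text) - 1:
        raise ValueError("expected one similarity score per consecutive frame pair")

    keyframes: List[int] = []
    previous_had_text = False
    for i, frame_has_text in enumerate(has_text):
        if not frame_has_text:
            # Frames without text are never keyframes and end the current segment.
            previous_had_text = False
            continue
        # A text frame is kept if the previous frame had no text,
        # or if its text differs enough from the previous frame.
        text_changed = i > 0 and similarity[i - 1] < sim_threshold
        if not previous_had_text or text_changed:
            keyframes.append(i)
        previous_had_text = True
    return keyframes
```

For example, with `has_text = [False, True, True, True, False, True]` and `similarity = [0.1, 0.9, 0.2, 0.1, 0.1]`, the sketch keeps frames 1, 3, and 5: frame 1 opens a text segment, frame 3 differs from frame 2, and frame 5 opens a new segment after a frame without text.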
Year
2019
DOI
10.1109/ICDAR.2019.00077
Venue
2019 International Conference on Document Analysis and Recognition (ICDAR)
Keywords
video textual keyframe detection, Text Siamese Network (TSN), attention mechanism
Field
Automatic summarization, Pattern recognition, Computer science, Natural language processing, Artificial intelligence
DocType
Conference
ISSN
1520-5363
ISBN
978-1-7281-3015-6
Citations
0
PageRank
0.34
References
4
Authors
6
Name            Order   Citations   PageRank
Hao Song        1       0           0.34
Hongzhen Wang   2       0           0.34
Shan Huang      3       0           0.34
Pei Xu          4       0           0.34
Shen Huang      5       11          4.61
Qi Ju           6       0           0.34