Title
Siamese Dynamic Mask Estimation Network For Fast Video Object Segmentation
Abstract
Video object segmentation(VOS) has been a fundamental topic in recent years, and many deep learning-based methods have achieved state-of-the-art performance on multiple benchmarks. However, most of these methods rely on pixel-level matching between the template and the searched frames on the whole image while the targets only occupy a small region. Calculating on the entire image brings lots of additional computation cost. Besides, the whole image may contain some distracting information resulting in many false-positive matching points. To address this issue, motivated by one-stage instance object segmentation methods, we propose an efficient siamese dynamic mask estimation network for fast video object segmentation. The VOS is decoupled into two tasks, i.e., mask feature learning and dynamic kernel prediction. The former is responsible for learning high-quality features to preserve structural geometric information, and the latter learns a dynamic kernel that is used to convolve with the mask feature to generate a mask output. We use Siamese neural network as a feature extractor and directly predict masks after correlation. In this way, we can avoid using pixel-level matching, making our framework more simple and efficient. Experiment results on DAVIS 2016/2017 datasets show that our proposed methods can run at 35 frames per second on NVIDIA RTX TITAN while preserving competitive accuracy.
Year
DOI
Venue
2020
10.1109/ICPR48806.2021.9412609
2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)
DocType
ISSN
Citations 
Conference
1051-4651
0
PageRank 
References 
Authors
0.34
0
5
Name
Order
Citations
PageRank
Dexiang Hong101.01
Guorong Li219619.93
Kai Xu300.34
Li Su485.24
Qingming Huang53919267.71