Siamese Dynamic Mask Estimation Network For Fast Video Object Segmentation - Citegraph

Paper Info

Title
Siamese Dynamic Mask Estimation Network For Fast Video Object Segmentation

Abstract
Video object segmentation (VOS) has been a fundamental topic in recent years, and many deep learning-based methods have achieved state-of-the-art performance on multiple benchmarks. However, most of these methods rely on pixel-level matching between the template and the searched frames over the whole image, even though the targets occupy only a small region. Computing over the entire image adds substantial computational cost. Moreover, the whole image may contain distracting information that produces many false-positive matching points. To address these issues, and motivated by one-stage instance segmentation methods, we propose an efficient Siamese dynamic mask estimation network for fast video object segmentation. VOS is decoupled into two tasks: mask feature learning and dynamic kernel prediction. The former learns high-quality features that preserve structural geometric information; the latter learns a dynamic kernel that is convolved with the mask feature to generate the output mask. We use a Siamese neural network as the feature extractor and predict masks directly after correlation. This avoids pixel-level matching and makes our framework simpler and more efficient. Experimental results on the DAVIS 2016/2017 datasets show that our proposed method runs at 35 frames per second on an NVIDIA RTX TITAN while preserving competitive accuracy.
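
The dynamic-kernel step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the predicted kernel is a 1x1 convolution kernel with one weight per channel, that a sigmoid turns the response into a foreground probability, and that 0.5 is the mask threshold. The function name `apply_dynamic_kernel` and the toy values are hypothetical.

```python
import numpy as np


def apply_dynamic_kernel(mask_feat, kernel, bias=0.0):
    """Convolve a (C, H, W) mask feature with a 1x1 kernel of shape (C,).

    Assumption: the dynamic kernel is 1x1, so the convolution reduces to a
    per-pixel dot product over the channel axis.
    """
    logits = np.einsum("chw,c->hw", mask_feat, kernel) + bias
    # Sigmoid maps each response to a foreground probability (assumed here).
    return 1.0 / (1.0 + np.exp(-logits))


# Toy mask feature: 2 channels on a 2x3 grid (made-up values).
mask_feat = np.array([
    [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]],
    [[0.0, 1.0, 0.0], [3.0, 0.0, 1.0]],
])
# Toy "predicted" kernel; in the paper it would come from the kernel branch.
kernel = np.array([1.0, -1.0])

prob = apply_dynamic_kernel(mask_feat, kernel)
mask = prob > 0.5  # binary mask under the assumed threshold
```

Because the kernel is produced per target rather than fixed, the same mask feature can yield different masks for different objects; here the kernel simply favours channel 0 over channel 1.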

Year	DOI	Venue
2020	10.1109/ICPR48806.2021.9412609	2020 25th International Conference on Pattern Recognition (ICPR)
DocType	ISSN	Citations
Conference	1051-4651	0
PageRank	References	Authors
0.34	0	5

Authors (5 rows)

Name	Order	Citations	PageRank
Dexiang Hong	1	0	1.01
Guorong Li	2	196	19.93
Kai Xu	3	0	0.34
Li Su	4	8	5.24
Qingming Huang	5	3919	267.71
