Title | ||
---|---|---|
Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks |
Abstract | ||
---|---|---|
Audio-Guided video object segmentation is a challenging problem in visual analysis and editing, which automatically separates foreground objects from the background in a video sequence according to the referring audio expressions. However, existing referring video object segmentation works mainly focus on the guidance of text-based referring expressions, due to the lack of modeling the semantic representations of audio-video interaction contents. In this paper, we consider the problem of audio-guided video semantic segmentation from the viewpoint of end-to-end denoising encoder-decoder network learning. We propose the wavelet-based encoder network to learn the cross-modal representations of the video contents with audio-form queries. Specifically, we adopt the multi-head cross-modal attention layers to explore the potential relations of video and query contents. A 2-dimension discrete wavelet transform is merged into the transformer encoder to decompose the audio-video features. Next, we maximize mutual information between the encoded features and multi-modal features after cross-modal attention layers to enhance the audio guidance. Then, a self attention-free decoder network is developed to generate the target masks with frequency-domain transforms. In addition, we construct the first large-scale audio-guided video semantic segmentation dataset. The extensive experiments show the effectiveness of our method. Code is available at: https://github.com/asudahkzj/Wnet.git. |
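
The abstract mentions a 2-dimension discrete wavelet transform used to decompose audio-video features. As a rough illustration only, the sketch below applies a single-level 2D Haar transform to a small feature map. The Haar wavelet, the function name `haar_dwt_2d`, and the sub-band labels are assumptions made for this example; the record does not state which wavelet or implementation the authors use.

```python
def haar_dwt_2d(x):
    """Single-level 2D Haar DWT (orthonormal scaling) of a 2D list.

    Hypothetical helper for illustration; not the authors' code.
    Requires even height and width. Returns four sub-bands
    (ll, lh, hl, hh), each half the input size in both dimensions.
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, w, 2):
            # One 2x2 block: a b on the top row, c d on the bottom row.
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2)  # block average (scaled)
            r_lh.append((a + b - c - d) / 2)  # top row minus bottom row
            r_hl.append((a - b + c - d) / 2)  # left column minus right column
            r_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


# Usage: a constant 4x4 map has all of its energy in the ll sub-band.
ll, lh, hl, hh = haar_dwt_2d([[3.0] * 4 for _ in range(4)])
```

With the orthonormal scaling used here, the sum of squared coefficients across the four sub-bands equals the sum of squared input values, which is a convenient sanity check. Naming conventions for the `lh` and `hl` sub-bands differ between libraries, so the comments describe what each one computes rather than relying on the labels.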
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/CVPR52688.2022.00138 | IEEE Conference on Computer Vision and Pattern Recognition |
Keywords | DocType | Volume |
Segmentation, grouping and shape analysis, Machine learning, Vision + X | Conference | 2022 |
Issue | Citations | PageRank |
1 | 0 | 0.34 |
References | Authors | |
0 | 10 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wenwen Pan | 1 | 0 | 0.34 |
Haonan Shi | 2 | 0 | 0.34 |
Zhou Zhao | 3 | 773 | 90.87 |
Jieming Zhu | 4 | 0 | 0.34 |
Xiuqiang He | 5 | 312 | 39.21 |
Zhigeng Pan | 6 | 0 | 0.34 |
Lianli Gao | 7 | 550 | 42.85 |
Jun Yu | 8 | 2597 | 105.69 |
Fei Wu | 9 | 2209 | 153.88 |
Qi Tian | 10 | 6443 | 331.75 |