Abstract
---
Panoptic segmentation aims to partition an image into object instances and semantic content for thing and stuff categories, respectively. To date, learning weakly supervised panoptic segmentation (WSPS) with only image-level labels remains unexplored. In this paper, we propose an efficient joint thing-and-stuff mining (JTSM) framework for WSPS. To this end, we design a novel mask of interest pooling (MoIPool) to extract fixed-size pixel-accurate feature maps from arbitrary-shape segmentations. MoIPool enables a panoptic mining branch to leverage multiple instance learning (MIL) to recognize thing and stuff segmentations in a unified manner. We further refine segmentation masks with parallel instance and semantic segmentation branches via self-training, combining the masks mined by panoptic mining with bottom-up object evidence as pseudo-ground-truth labels to improve spatial coherence and contour localization. Experimental results demonstrate the effectiveness of JTSM on PASCAL VOC and MS COCO. As a by-product, we achieve competitive results for weakly supervised object detection and instance segmentation. This work is a first step towards tackling the challenging panoptic segmentation task with only image-level labels.
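To make the MoIPool idea concrete, below is a minimal numpy sketch of pooling a fixed-size feature grid from an arbitrary-shape segmentation mask: crop the mask's bounding box from the feature map, zero out features outside the mask, and resample to a fixed grid. The function name `moi_pool` and the nearest-neighbour resampling are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def moi_pool(features, mask, out_size=7):
    """Illustrative mask-of-interest pooling (hypothetical sketch, not the
    paper's code): crop the mask's bounding box from a C x H x W feature map,
    zero out pixels outside the mask, and resample the masked region to a
    fixed out_size x out_size grid with nearest-neighbour sampling."""
    ys, xs = np.nonzero(mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    # Mask out features that fall outside the segmentation.
    region = features[:, y0:y1, x0:x1] * mask[None, y0:y1, x0:x1]
    h, w = region.shape[1:]
    # Nearest-neighbour indices mapping the crop onto the fixed output grid.
    yy = np.linspace(0, h - 1, out_size).round().astype(int)
    xx = np.linspace(0, w - 1, out_size).round().astype(int)
    return region[:, yy][:, :, xx]

feat = np.random.rand(16, 32, 32)            # C x H x W feature map
m = np.zeros((32, 32), dtype=np.float32)
m[5:20, 8:25] = 1.0                          # toy segmentation mask
pooled = moi_pool(feat, m)
print(pooled.shape)                          # (16, 7, 7)
```

The fixed-size output is what lets a downstream MIL classification head score thing and stuff segment proposals of any shape with the same weights.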
Year | DOI | Venue
---|---|---
2021 | 10.1109/CVPR46437.2021.01642 | 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021)

DocType | ISSN | Citations
---|---|---
Conference | 1063-6919 | 0

PageRank | References | Authors
---|---|---
0.34 | 0 | 9
Name | Order | Citations | PageRank
---|---|---|---
Yunhang Shen | 1 | 29 | 7.25 |
Liujuan Cao | 2 | 213 | 27.37 |
Zhiwei Chen | 3 | 0 | 2.37 |
Feihong Lian | 4 | 0 | 0.68 |
Baochang Zhang | 5 | 1130 | 93.76 |
Chi Su | 6 | 0 | 0.68 |
Yongjian Wu | 7 | 24 | 3.49 |
Feiyue Huang | 8 | 226 | 41.86 |
Rongrong Ji | 9 | 3616 | 189.98 |