Abstract |
---|
In this paper, we address the problem of scene parsing with deep learning, focusing on the context aggregation strategy for robust segmentation. Motivated by the fact that the label of a pixel is the category of the object the pixel belongs to, we introduce an *object context pooling (OCP)* scheme, which represents each pixel by exploiting the set of pixels that belong to the same object category as that pixel; we call this set the object context. Our implementation, inspired by the self-attention approach, consists of two steps: (i) compute the similarities between each pixel and all the pixels, forming a so-called object context map for each pixel that serves as a surrogate for the true object context, and (ii) represent the pixel by aggregating the features of all the pixels weighted by these similarities. The resulting representation is more robust than those from existing context aggregation schemes, e.g., the pyramid pooling module (PPM) in PSPNet and atrous spatial pyramid pooling (ASPP), which do not differentiate between context pixels that belong to the same object category and those that do not, limiting the reliability of the contextually aggregated representations. We empirically demonstrate our approach and two pyramid extensions with state-of-the-art performance on three semantic segmentation benchmarks: Cityscapes, ADE20K and LIP. Code has been made available at: this https URL. |
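The two steps in the abstract (a per-pixel similarity map, then similarity-weighted feature aggregation) can be sketched as a minimal self-attention-style pooling. This is a hedged illustration, not the paper's implementation: the function name `object_context_pooling`, the use of a plain dot product for similarity, and the softmax normalization are assumptions; the paper's actual OCP module operates on convolutional feature maps with learned projections.

```python
import numpy as np

def object_context_pooling(features):
    """Illustrative sketch of the two-step aggregation described above.

    features: (N, C) array, one C-dimensional feature per pixel
              (N = H * W pixels, flattened).
    Returns:  (N, C) array of object-context representations.
    """
    # Step (i): similarity between each pixel and all pixels,
    # softmax-normalized per row -- a stand-in for the "object context map".
    scores = features @ features.T                  # (N, N) dot-product similarities
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability for exp
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # each row sums to 1

    # Step (ii): represent each pixel as the similarity-weighted
    # average of all pixel features.
    return weights @ features                       # (N, C)
```

Because the weights are softmax-normalized, pixels with similar features (ideally, pixels of the same object category) dominate each pixel's aggregated representation, which is the intuition behind using the similarity map as a surrogate for the true object context.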
Year | Venue | Field |
---|---|---|
2018 | arXiv: Computer Vision and Pattern Recognition | Pattern recognition, Segmentation, Computer science, Pooling, Pyramid, Artificial intelligence, Pixel, Parsing, Deep learning |

DocType | Volume | Citations |
---|---|---|
Journal | abs/1809.00916 | 12 |

PageRank | References | Authors |
---|---|---|
0.53 | 15 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yuhui Yuan | 1 | 34 | 3.34 |
Jingdong Wang | 2 | 4198 | 156.76 |