Abstract | ||
---|---|---|
This paper introduces DFANet, an extremely efficient CNN architecture for semantic segmentation under resource constraints. The proposed network starts from a single lightweight backbone and aggregates discriminative features through sub-network and sub-stage cascades, respectively. Based on this multi-scale feature propagation, DFANet substantially reduces the number of parameters while still obtaining a sufficient receptive field and enhancing the model's learning ability, striking a balance between speed and segmentation performance. Experiments on the Cityscapes and CamVid datasets demonstrate the superior performance of DFANet, which requires 8× fewer FLOPs and runs 2× faster than existing state-of-the-art real-time semantic segmentation methods while providing comparable accuracy. Specifically, it achieves 70.3% mean IoU on the Cityscapes test set with only 1.7 GFLOPs and a speed of 160 FPS on one NVIDIA Titan X card, and 71.3% mean IoU with 3.4 GFLOPs when inferring on a higher-resolution image. |
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/CVPR.2019.00975 | 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019) |
Field | DocType | Volume |
---|---|---|
Pattern recognition, Computer science, Segmentation, FLOPS, Artificial intelligence, Cascade, Feature aggregation, Discriminative model, Model learning | Journal | abs/1904.02216 |
ISSN | Citations | PageRank |
---|---|---|
1063-6919 | 15 | 0.53 |
References | Authors | |
---|---|---|
0 | 4 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Hanchao Li | 1 | 15 | 0.53 |
Pengfei Xiong | 2 | 38 | 3.69 |
Haoqiang Fan | 3 | 227 | 12.94 |
Jian Sun | 4 | 25842 | 956.90 |