Abstract
---
We investigate the design aspects of feature distillation methods for network compression and propose a novel feature distillation method whose distillation loss is designed to create a synergy among its components: the teacher transform, the student transform, the distillation feature position, and the distance function. The proposed distillation loss includes a feature transform with a newly designed margin ReLU, a new distillation feature position, and a partial L2 distance function that skips redundant information which would adversely affect the compression of the student. On ImageNet, the proposed method achieves a top-1 error of 21.65% with ResNet50, outperforming its teacher network, ResNet152. The proposed method is evaluated on various tasks such as image classification, object detection, and semantic segmentation, and achieves significant performance improvements on all of them.
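The abstract names two components concretely enough to sketch: the margin ReLU teacher transform and the partial L2 distance. Below is a minimal PyTorch-style sketch of plausible definitions, assuming the margin ReLU clamps teacher features from below at a negative per-channel margin (rather than at zero), and the partial L2 distance zeroes out positions where the teacher response is non-positive and the student is already below it. The function names and the shape of `margin` are illustrative assumptions, not taken from the paper's code.

```python
import torch

def margin_relu(teacher_feat, margin):
    """Hypothetical margin ReLU: clamp the teacher feature from below at a
    negative per-channel margin instead of zero, preserving a bounded amount
    of negative information for the student to stay below.
    margin: tensor of shape (1, C, 1, 1) with negative values (assumed)."""
    return torch.max(teacher_feat, margin)

def partial_l2_loss(teacher_feat, student_feat):
    """Hypothetical partial L2 distance: skip (zero out) positions where the
    teacher value is non-positive and the student value is already below it,
    so redundant negative information does not penalize the student."""
    diff = (teacher_feat - student_feat) ** 2
    skip = (teacher_feat <= 0) & (student_feat <= teacher_feat)
    return torch.where(skip, torch.zeros_like(diff), diff).sum()

# Example usage with pre-activation features (the "new distillation feature
# position" is assumed here to be the pre-ReLU output of a block):
t = torch.randn(8, 64, 14, 14)            # teacher features
s = torch.randn(8, 64, 14, 14)            # student features after its transform
m = -0.5 * torch.ones(1, 64, 1, 1)        # illustrative per-channel margins
loss = partial_l2_loss(margin_relu(t, m), s)
```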
Year | DOI | Venue
---|---|---
2019 | 10.1109/ICCV.2019.00201 | 2019 IEEE/CVF International Conference on Computer Vision (ICCV 2019)

Field | DocType | Volume
---|---|---
Object detection, Pattern recognition, Segmentation, Computer science, Metric (mathematics), Distillation, Artificial intelligence, Feature transform, Contextual image classification, Performance improvement | Journal | abs/1904.01866

Issue | ISSN | Citations
---|---|---
1 | 1550-5499 | 10

PageRank | References | Authors
---|---|---
0.55 | 0 | 6
Name | Order | Citations | PageRank |
---|---|---|---
Byeongho Heo | 1 | 40 | 7.28 |
Kim Jee-Soo | 2 | 11 | 2.61 |
Sangdoo Yun | 3 | 34 | 2.85 |
Hyojin Park | 4 | 10 | 0.89 |
Nojun Kwak | 5 | 862 | 63.79 |
Jin Young Choi | 6 | 768 | 99.57 |