Abstract
---
Online knowledge distillation (OKD), which simultaneously trains several peer networks to construct a powerful teacher on the fly, has drawn much attention in recent years. OKD is designed to simplify the training procedure of conventional offline distillation. However, the ensemble strategy of existing OKD methods is inflexible and relies heavily on random initialization. In this paper, we propose Adaptable Ensemble Distillation (AED), which inherits the merits of existing OKD methods while overcoming their major drawbacks. The novelty of AED lies in three aspects: (1) an individual-regulated mechanism flexibly regulates each individual model and further generates an online ensemble with strong adaptability; (2) a diversity-aroused loss explicitly diversifies the individual models, enhancing the robustness of the ensemble; (3) an empirical distillation technique directly promotes knowledge transfer within the OKD framework. Extensive experiments show that the proposed AED consistently outperforms existing state-of-the-art OKD methods on various datasets.
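The abstract describes distilling each peer network toward an online ensemble of all peers. The record does not give AED's exact formulation, so the following is a minimal, generic sketch of an OKD ensemble-distillation loss: peers' softened predictions are averaged into an ensemble target, and each peer is pulled toward it with a KL-divergence term. The function name `okd_distill_loss`, the uniform ensemble weights, and the temperature value are illustrative assumptions, not the paper's method.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def okd_distill_loss(peer_logits, T=3.0, weights=None):
    """Generic online-KD loss (illustrative, not AED itself):
    each peer is pulled toward the weighted ensemble of all peers'
    softened predictions via KL(ensemble || peer).

    peer_logits: list of (batch, classes) logit arrays, one per peer.
    Returns the T^2-scaled mean KL over peers and batch.
    """
    probs = [softmax(l, T) for l in peer_logits]
    if weights is None:  # uniform ensemble weights as a default
        weights = np.full(len(probs), 1.0 / len(probs))
    ensemble = sum(w * p for w, p in zip(weights, probs))
    kls = [
        np.sum(ensemble * (np.log(ensemble + 1e-12) - np.log(p + 1e-12)),
               axis=-1).mean()
        for p in probs
    ]
    return (T ** 2) * float(np.mean(kls))
```

When all peers agree, the ensemble equals each peer's distribution and the loss is zero; disagreement among peers yields a positive loss, which is what the distillation term penalizes. AED's individual-regulated mechanism and diversity-aroused loss would modify how `weights` and the peer targets are formed, but those details are not in this record.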
Year | DOI | Venue
---|---|---
2021 | 10.1109/ICASSP39728.2021.9415015 | 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021)

Keywords | DocType | Citations
---|---|---
Online knowledge distillation, Individual-regulated mechanism, Diversity-aroused loss, Neural network ensemble | Conference | 0

PageRank | References | Authors
---|---|---
0.34 | 0 | 5
Name | Order | Citations | PageRank
---|---|---|---
Yankai Wang | 1 | 0 | 1.01
Dawei Yang | 2 | 0 | 0.34
Wei Zhang | 3 | 452 | 19.35
Zhe Jiang | 4 | 26 | 9.94
Wenqiang Zhang | 5 | 0 | 0.34