Abstract |
---|
Deep neural networks (DNNs) can forget knowledge about earlier tasks when learning new ones, a phenomenon known as catastrophic forgetting. To learn new tasks without forgetting, mask-based learning methods (e.g., Piggyback [10]) were recently proposed; they learn only a binary element-wise mask while keeping the backbone model fixed. However, a binary mask has limited modeling capacity for new tasks. A more recent work [5] proposes a compress-grow-based method (CPG) that achieves better accuracy on new tasks by partially training the backbone model, but at an order-of-magnitude higher training cost, which makes it infeasible to deploy in popular state-of-the-art edge/mobile learning settings. The primary goal of this work is to simultaneously achieve fast and high-accuracy multi-task adaptation in the continual learning setting. Thus motivated, we propose a new training method called Kernel-wise Soft Mask (KSM), which learns a kernel-wise hybrid binary and real-valued soft mask for each task. Such a hybrid mask can be viewed as a superposition of a binary mask and a properly scaled real-valued tensor, which offers richer representation capability without requiring low-level kernel support, meeting the objective of low hardware overhead. We validate KSM on multiple benchmark datasets against recent state-of-the-art methods (e.g., Piggyback, PackNet, CPG), showing improvements in both accuracy and training cost. |
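
The hybrid mask described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the thresholding rule, the `threshold` and `scale` values, and the nested-list weight layout (`weights[out][in]` holding one 2-D kernel) are all assumptions chosen for illustration. It only shows the general idea of one mask value per kernel, formed as a binary mask plus a scaled real-valued term, applied to a frozen backbone.

```python
# Illustrative sketch only: names, threshold, and scale are assumptions,
# not values or APIs taken from the paper.

def hybrid_soft_mask(scores, threshold=0.5, scale=0.1):
    """Return binary(score) + scale * score for each kernel-level score."""
    return [
        [(1.0 if s > threshold else 0.0) + scale * s for s in row]
        for row in scores
    ]


def apply_kernel_mask(weights, mask):
    """Scale every 2-D kernel weights[o][i] by the single value mask[o][i]."""
    return [
        [
            [[mask[o][i] * w for w in kernel_row] for kernel_row in kernel]
            for i, kernel in enumerate(filter_kernels)
        ]
        for o, filter_kernels in enumerate(weights)
    ]


# Frozen backbone: 1 output channel, 2 input channels, 1x2 kernels.
backbone = [[[[1.0, 2.0]], [[3.0, 4.0]]]]
# Task-specific learnable scores, one per kernel (hypothetical values).
scores = [[0.2, 0.8]]

mask = hybrid_soft_mask(scores)
task_weights = apply_kernel_mask(backbone, mask)
```

In this sketch the backbone list is never modified; only the per-kernel scores would be trained and stored for each task, which is what keeps the per-task cost small compared with retraining the weights themselves.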
Year | DOI | Venue |
---|---|---|
2021 | 10.1109/CVPR46437.2021.01363 | 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 |

DocType | ISSN | Citations |
---|---|---|
Conference | 1063-6919 | 0 |

PageRank | References | Authors |
---|---|---|
0.34 | 8 | 4 |

Name | Order | Citations | PageRank |
---|---|---|---|
Li Yang | 1 | 10 | 4.50 |
Zhezhi He | 2 | 136 | 25.37 |
Junshan Zhang | 3 | 2905 | 220.99 |
Deliang Fan | 4 | 375 | 53.66 |