Abstract |
---|
Hidden Networks (Ramanujan et al., 2020) showed the possibility of finding accurate subnetworks within a randomly weighted neural network by training a connectivity mask, referred to as a supermask. We show that the supermask stops improving even though gradients are not zero, thus underutilizing backpropagated information. To address this, we propose a method that extends Hidden Networks by training an overlay of multiple hierarchical supermasks: a multicoated supermask. Using multiple supermasks for a single task achieves higher accuracy without additional training cost. Experiments on CIFAR-10 and ImageNet show that Multicoated Supermasks enhance the tradeoff between accuracy and model size. A ResNet-101 using a 7-coated supermask outperforms its Hidden Networks counterpart by 4%, matching the accuracy of a dense ResNet-50 while being an order of magnitude smaller. |
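The overlay idea in the abstract can be illustrated in code. Below is a minimal NumPy sketch, assuming score-based top-k selection in the style of edge-popup (Ramanujan et al., 2020): each "coat" is a binary supermask at a different sparsity level, and summing the nested masks yields a multi-valued multicoated mask over the frozen random weights. The function names and sparsity levels are illustrative, not the paper's implementation:

```python
import numpy as np

def supermask(scores, k):
    """Binary mask keeping the top fraction k of scores (edge-popup style)."""
    flat = scores.ravel()
    n_keep = max(1, int(round(k * flat.size)))
    # Threshold at the n_keep-th largest score.
    thresh = np.partition(flat, -n_keep)[-n_keep]
    return (scores >= thresh).astype(np.float32)

def multicoated_mask(scores, ks):
    """Overlay of hierarchical supermasks: because the top-k sets are
    nested, weights selected by more coats receive larger multipliers."""
    return sum(supermask(scores, k) for k in sorted(ks))

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))           # frozen random weights
s = rng.standard_normal((4, 4))           # learned connectivity scores
m = multicoated_mask(s, [0.1, 0.3, 0.5])  # 3-coated supermask
effective = w * m                         # weights used at inference
```

Note the hierarchy: every weight in the sparsest coat is also in the denser ones, so the mask values count how many coats select each weight.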
Year | Venue | DocType |
---|---|---|
2022 | International Conference on Machine Learning | Conference |
Citations | PageRank | References
---|---|---|
0 | 0.34 | 0
Authors |
---|
8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yasuyuki Okoshi | 1 | 0 | 0.34 |
Ángel López García-Arias | 2 | 0 | 0.68 |
Kazutoshi Hirose | 3 | 1 | 1.39 |
Kota Ando | 4 | 24 | 6.81 |
Kazushi Kawamura | 5 | 3 | 2.58 |
Thiem Van Chu | 6 | 1 | 2.74 |
Masato Motomura | 7 | 8 | 3.65 |
Jaehoon Yu | 8 | 28 | 22.44 |