Title |
---|
Accelerating Neural Architecture Search for Natural Language Processing with Knowledge Distillation and Earth Mover's Distance |
Abstract |
---|
Recent AI research has witnessed increasing interest in automatically designing the architecture of deep neural networks, coined neural architecture search (NAS). Network architectures searched automatically via NAS methods have outperformed manually designed architectures on some NLP tasks. However, training a large number of model configurations for efficient NAS is computationally expensive, creating a substantial barrier to applying NAS methods in real-life applications. In this paper, we propose to accelerate neural architecture search for natural language processing with knowledge distillation (called KD-NAS). Specifically, instead of searching for the optimal network architecture on the validation set conditioned on the optimal network weights on the training set, we learn the optimal network by minimizing the knowledge loss transferred from a pre-trained teacher network to the searched network based on Earth Mover's Distance (EMD). Experiments on five datasets show that our method achieves promising performance compared to strong competitors in both accuracy and search speed. For reproducibility, the code is available at: https://github.com/lxk00/KD-NAS-EMD. |
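The abstract describes guiding the search by an EMD-based knowledge loss between teacher and student representations. As a minimal illustrative sketch (not the paper's implementation): with uniform layer weights and equal layer counts, EMD reduces to a minimum-cost one-to-one matching between teacher and student layers, solvable with the Hungarian algorithm. All names and the toy features below are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def emd_layer_matching_loss(teacher_feats, student_feats):
    """Hypothetical sketch of an EMD-style distillation loss.

    With uniform weights and equal layer counts, the Earth Mover's
    Distance between two sets of layer representations reduces to a
    minimum-cost one-to-one matching over a pairwise cost matrix.
    """
    # Cost matrix: mean squared distance between every teacher/student layer pair.
    cost = np.array([[np.mean((t - s) ** 2) for s in student_feats]
                     for t in teacher_feats])
    rows, cols = linear_sum_assignment(cost)  # min-cost layer matching
    return cost[rows, cols].mean()

# Toy example: three "layers" of 4-dimensional pooled features each.
rng = np.random.default_rng(0)
teacher = [rng.normal(size=4) for _ in range(3)]
student = [rng.normal(size=4) for _ in range(3)]
loss = emd_layer_matching_loss(teacher, student)
```

In the general EMD formulation the layer weights need not be uniform and the transport plan need not be a permutation; the assignment view here is only the simplest special case.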
Year | DOI | Venue |
---|---|---|
2021 | 10.1145/3404835.3463017 | Research and Development in Information Retrieval |
Keywords | DocType | Citations
---|---|---|
Neural architecture search, knowledge distillation, earth mover's distance | Conference | 0
PageRank | References | Authors
---|---|---|
0.34 | 0 | 6
Name | Order | Citations | PageRank |
---|---|---|---|
Jianquan Li | 1 | 7 | 5.26 |
Xiaokang Liu | 2 | 0 | 1.01 |
Sheng Zhang | 3 | 6 | 1.44 |
Min Yang | 4 | 77 | 20.41 |
Ruifeng Xu | 5 | 432 | 53.04
Feng-qing Qin | 6 | 22 | 1.88 |