Abstract |
---|
In single-channel speech enhancement, methods based on full-band spectral features have been widely studied, whereas few methods pay attention to sub-band spectral features. In this paper, we explore a knowledge distillation framework based on sub-band spectral mapping for single-channel speech enhancement. Specifically, we divide the full frequency band into multiple sub-bands and pre-train an elite-level sub-band enhancement model (teacher model) for each sub-band; each teacher model is dedicated to processing its own sub-band. Then, under the teacher models' guidance, we train a general sub-band enhancement model (student model) that works across all sub-bands. The student model's performance is improved without increasing the number of model parameters or the computational complexity. To evaluate the proposed method, we conducted extensive experiments on an open-source dataset. The results show that the guidance from the elite-level teacher models markedly improves the student model, which surpasses the full-band model while using fewer parameters. |
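The abstract describes splitting the spectrum into sub-bands and training a student model under per-sub-band teacher guidance. The following is a minimal sketch of that idea in plain Python; all function names, the equal-width band split, and the weighted-loss form are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: the names and the loss weighting below are
# assumptions, not taken from the paper's code.

def split_subbands(frame, n_subbands):
    """Split one spectral frame (a list of magnitudes) into equal-width sub-bands."""
    width = len(frame) // n_subbands
    return [frame[i * width:(i + 1) * width] for i in range(n_subbands)]

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def distillation_loss(student_out, teacher_out, clean, alpha=0.5):
    """Combine a supervised term (vs. the clean target) with a
    distillation term (vs. the pre-trained sub-band teacher's output)."""
    return alpha * mse(student_out, clean) + (1 - alpha) * mse(student_out, teacher_out)

# Toy usage: one 8-bin frame split into 2 sub-bands. The "teacher outputs"
# here are placeholders standing in for pre-trained per-sub-band teachers.
noisy = [0.9, 0.1, 0.4, 0.8, 0.2, 0.7, 0.3, 0.6]
clean = [0.8, 0.0, 0.5, 0.7, 0.1, 0.8, 0.2, 0.5]
teacher_outs = split_subbands(clean, 2)
losses = [distillation_loss(s, t, c)
          for s, t, c in zip(split_subbands(noisy, 2),
                             teacher_outs,
                             split_subbands(clean, 2))]
```

Because the same student processes every sub-band, the per-sub-band losses are simply averaged during training, which is how the student gains from the teachers without adding parameters.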
Year | DOI | Venue
---|---|---
2020 | 10.21437/Interspeech.2020-1539 | INTERSPEECH

DocType | Citations | PageRank
---|---|---
Conference | 1 | 0.36

References | Authors
---|---
0 | 6
Name | Order | Citations | PageRank
---|---|---|---
Xiang Hao | 1 | 24 | 7.87 |
Wen Shixue | 2 | 1 | 0.36 |
Su Xiangdong | 3 | 1 | 0.36 |
Liu Yun | 4 | 1 | 2.05 |
Guanglai Gao | 5 | 78 | 24.57 |
Xiaofei Li | 6 | 103 | 24.78 |