Abstract |
---|
Target speaker separation aims to extract a target speaker's voice from a mixture of interfering voices. It is a promising way around long-standing difficulties in conventional speech separation, such as arbitrary source permutation and an unknown number of sources, and is useful for personal applications such as online meetings and personal phone calls. Recently, deep-learning-based models have provided more alternatives for target speaker separation tasks. In this paper, we propose a jointly trained target speaker separation neural network that separates the target voice in the spectrogram domain using a proposed combinative loss function. Experimental results show that, compared with the baseline, the proposed method yields better performance on both test data and real recordings, and that the proposed combinative loss function is more effective for this task. |
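The abstract does not specify how the combinative loss is formed. As a generic, hedged illustration only (not the paper's actual formulation), a combined objective for spectrogram-domain separation might weight a spectrogram-magnitude error against a time-domain scale-invariant SNR term; the names `combinative_loss`, `si_snr`, and the weight `alpha` below are illustrative assumptions:

```python
import numpy as np

def si_snr(est, ref, eps=1e-8):
    """Scale-invariant SNR in dB between an estimated and a reference waveform."""
    ref = ref - ref.mean()
    est = est - est.mean()
    # Project the estimate onto the reference to split signal from noise.
    proj = np.dot(est, ref) / (np.dot(ref, ref) + eps) * ref
    noise = est - proj
    return 10 * np.log10((np.dot(proj, proj) + eps) / (np.dot(noise, noise) + eps))

def combinative_loss(est_mag, ref_mag, est_wav, ref_wav, alpha=0.5):
    """Hypothetical combined loss: spectrogram-magnitude MSE plus a
    negative SI-SNR term on the waveform (lower is better)."""
    spec_mse = np.mean((est_mag - ref_mag) ** 2)
    return alpha * spec_mse - (1 - alpha) * si_snr(est_wav, ref_wav)

# A perfect estimate should score strictly better than a noisy one.
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
noisy = ref + 0.5 * rng.standard_normal(1024)
ref_mag = np.abs(np.fft.rfft(ref))
noisy_mag = np.abs(np.fft.rfft(noisy))
clean_loss = combinative_loss(ref_mag, ref_mag, ref, ref)
noisy_loss = combinative_loss(noisy_mag, ref_mag, noisy, ref)
```

Weighting a spectral term against a time-domain term is a common design choice in separation networks because the two terms penalize complementary error types (magnitude distortion versus phase/waveform mismatch).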
Year | Venue | DocType |
---|---|---|
2021 | 2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | Conference |

ISSN | Citations | PageRank |
---|---|---|
2309-9402 | 0 | 0.34 |

References | Authors |
---|---|
0 | 7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wenjing Yang | 1 | 0 | 0.68 |
Jing Wang | 2 | 0 | 3.72 |
Hongfeng Li | 3 | 0 | 0.34 |
Na Xu | 4 | 0 | 0.34 |
Fei Xiang | 5 | 0 | 0.34 |
Kai Qian | 6 | 0 | 0.34 |
Shenghua Hu | 7 | 0 | 0.68 |