A Spectral-Change-Aware Loss Function for DNN-Based Speech Separation - Citegraph

Paper Info

Title
A Spectral-Change-Aware Loss Function for DNN-Based Speech Separation

Abstract
Speech separation can be treated as a mask estimation problem in which supervised learning is used to learn a mapping from acoustic features to a mask. Applying the estimated mask to a time-frequency (T-F) representation of noisy speech reduces interference and thereby improves speech intelligibility. Most existing learning networks for speech separation minimize the mean square error (MSE) over the training set, with the loss from every T-F unit weighted equally. In this paper, we propose a spectral-change-aware loss function in which the loss from T-F units with large spectral changes over time is assigned a higher weight than the loss from T-F units with minor spectral changes. The proposed loss function was evaluated on speech separation performance in terms of mask estimation accuracy, short-time objective intelligibility (STOI), and SNR gain of unvoiced segments. The results indicate that the proposed loss function can further improve speech intelligibility and increase the SNR gain of unvoiced segments, even at the cost of an increased error rate in the estimated mask.
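
The abstract does not give the exact weighting rule, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes the spectral change is measured as the absolute frame-to-frame difference of a log-magnitude spectrogram of the clean speech, and that the weight is `1 + alpha * change`; the function names, the `alpha` parameter, and the normalization are all hypothetical.

```python
import numpy as np


def spectral_change_weights(clean_log_mag, alpha=1.0):
    """Per-unit weights that grow with frame-to-frame spectral change.

    clean_log_mag: array of shape (frames, freq_bins), assumed here to be
    the log-magnitude spectrogram of the clean speech.
    alpha: hypothetical emphasis factor; alpha = 0 gives uniform weights.
    """
    # Absolute first-order difference along the time axis. The first frame
    # is compared with itself, so its change is zero.
    delta = np.abs(np.diff(clean_log_mag, axis=0, prepend=clean_log_mag[:1]))
    # Scale to [0, 1] so that alpha is comparable across utterances
    # (an assumption of this sketch, not taken from the paper).
    peak = delta.max()
    if peak > 0:
        delta = delta / peak
    return 1.0 + alpha * delta


def weighted_mse(est_mask, target_mask, weights):
    """Weighted mean square error over all T-F units."""
    squared_error = (est_mask - target_mask) ** 2
    return float(np.sum(weights * squared_error) / np.sum(weights))
```

With `alpha = 0` every weight is 1 and the function reduces to the ordinary MSE; larger values of `alpha` make errors in rapidly changing T-F units count for more of the total loss.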

Year	DOI	Venue
2019	10.1109/icassp.2019.8683850	2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords	Field	DocType
Speech separation, loss function, spectral change, speech intelligibility	Training set, Pattern recognition, Computer science, Word error rate, Mean squared error, Supervised learning, Interference (wave propagation), Artificial intelligence, Intelligibility (communication)	Conference
ISSN	Citations	PageRank
1520-6149	0	0.34
References	Authors
0	3

Authors (3 rows)

Name	Order	Citations	PageRank
Xiang Li	1	12	1.33
Xihong Wu	2	4	1.49
Jing Chen	3	285	60.83
