Title
A Spectral-Change-Aware Loss Function for DNN-Based Speech Separation
Abstract
Speech separation can be treated as a mask estimation problem in which supervised learning is used to construct the mapping from acoustic features to a mask. Interference can be reduced by applying the estimated mask to a time-frequency (T-F) representation of noisy speech, resulting in improved speech intelligibility. Most existing learning networks for speech separation aim to minimize the mean square error (MSE) over the training set, where the loss from each T-F unit is equally weighted. In this paper, we propose a spectral-change-aware loss function, in which the loss from T-F units with large spectral changes over time is assigned a higher weight than the loss from T-F units with minor spectral changes. The proposed loss function was evaluated on speech separation performance in terms of mask estimation accuracy, short-time objective intelligibility (STOI), and SNR gain of unvoiced segments. The results indicate that the proposed loss function can further improve speech intelligibility and increase the SNR gain of unvoiced segments, even at the cost of an increased error rate in the estimated mask.
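The weighting idea in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the weighting formula, so everything here is an assumption made for illustration: the function names `spectral_change_weights` and `weighted_mse`, the parameter `alpha`, and the choice of weight (one plus the normalized absolute frame-to-frame change of a log-magnitude spectrogram) are hypothetical and are not the authors' definition.

```python
import numpy as np

def spectral_change_weights(log_spec, alpha=1.0):
    """Hypothetical weighting: 1 + alpha * normalized |frame-to-frame change|.

    log_spec: array of shape (frames, freq_bins), e.g. a log-magnitude spectrogram.
    alpha:    assumed knob; alpha=0 gives uniform weights (plain MSE).
    """
    # Absolute change along the time axis; the first frame is compared
    # with itself, so its change is 0.
    delta = np.abs(np.diff(log_spec, axis=0, prepend=log_spec[:1]))
    # Scale to [0, 1]; guard against a spectrogram that never changes.
    peak = delta.max()
    norm = delta / peak if peak > 0 else np.zeros_like(delta)
    return 1.0 + alpha * norm

def weighted_mse(mask_est, mask_ref, weights):
    """Weighted mean square error over all T-F units."""
    return float(np.sum(weights * (mask_est - mask_ref) ** 2) / np.sum(weights))

# Toy example: 3 frames x 2 frequency bins, with one abrupt spectral change.
log_spec = np.array([[0.0, 0.0], [0.0, 4.0], [0.0, 4.0]])
mask_ref = np.zeros((3, 2))
mask_ref[1, 1] = 1.0           # the only mask error sits on the changing unit
mask_est = np.zeros((3, 2))

weights = spectral_change_weights(log_spec, alpha=1.0)
loss_weighted = weighted_mse(mask_est, mask_ref, weights)
loss_plain = weighted_mse(mask_est, mask_ref, np.ones_like(weights))
```

In this toy case the single error falls on the T-F unit with the largest spectral change, so the weighted loss is larger than the plain MSE. Under this assumed weighting, that is the mechanism that would push a network to fit rapidly changing regions more closely.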
Year
2019
DOI
10.1109/icassp.2019.8683850
Venue
2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords
Speech separation, loss function, spectral change, speech intelligibility
Field
Training set, Pattern recognition, Computer science, Word error rate, Mean squared error, Supervised learning, Interference (wave propagation), Artificial intelligence, Intelligibility (communication)
DocType
Conference
ISSN
1520-6149
Citations
0
PageRank
0.34
References
0
Authors
3
Name
Order
Citations
PageRank
Xiang Li1121.33
Xihong Wu241.49
Jing Chen328560.83