Abstract |
---|
Various ideal masks have been used as the training targets for supervised speech separation. While different choices often lead to different results, the reason remains unclear. In this paper, an oracle method is applied to investigate the properties of the ideal masks, including the ideal binary mask (IBM), the ideal ratio mask (IRM), the phase-sensitive mask (PSM) and the complex ideal ratio mask (cIRM). They are evaluated in terms of intelligibility and quality under a general speech separation scenario. The relative importance of phase is also taken into consideration. Moreover, we introduce a novel ideal gain mask (IGM), which performs generally better than the IRM and one alternative of the cIRM. It is shown that the masks are not equally designed, and the results provide a strong reference for the potential performance of supervised algorithms. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/IWAENC.2016.7602888 | 2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC) |
Keywords | Field | DocType |
supervised speech separation, ideal masks, speech intelligibility and quality | Strong reference, IBM, Noise measurement, Computer science, Signal-to-noise ratio, Oracle, Speech recognition, Distortion, Intelligibility (communication), Binary number | Conference |
ISBN | Citations | PageRank |
978-1-5090-2008-9 | 5 | 0.47 |
References | Authors | |
12 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ziteng Wang | 1 | 17 | 5.55 |
Xiaofei Wang | 2 | 13 | 4.99 |
Xu Li | 3 | 8 | 2.22 |
Qiang Fu | 4 | 791 | 81.92 |
Yonghong Yan | 5 | 656 | 114.13 |