Abstract | ||
---|---|---|
This study addresses the challenge of automatically segregating a target speech signal from interfering background noise. A computational speech segregation system is presented which exploits logarithmically scaled amplitude modulation spectrogram (AMS) features to distinguish between speech and noise activity on the basis of individual time-frequency (T-F) units. One important parameter of the segregation system is the window duration of the analysis-synthesis stage, which determines both the lower limit of modulation frequencies that can be represented and the temporal acuity with which the segregation system can manipulate individual T-F units. To clarify the consequences of this trade-off for modulation-based speech segregation performance, the influence of the window duration was systematically investigated. |
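
The trade-off described in the abstract can be illustrated with a minimal sketch. It assumes the common rule of thumb that a window of duration T can represent modulation frequencies down to roughly 1/T. The function name `window_tradeoff` and the listed window durations are hypothetical and are not taken from the paper.

```python
def window_tradeoff(window_ms):
    """Return (lowest modulation frequency in Hz, window duration in s).

    Assumption (not from the paper): a window of duration T captures
    about one full modulation cycle, so the lower limit is taken as 1 / T.
    """
    duration_s = window_ms / 1000.0
    return 1.0 / duration_s, duration_s


# Hypothetical window durations in milliseconds, chosen only for illustration.
for window_ms in (8, 32, 128):
    f_min, duration_s = window_tradeoff(window_ms)
    print(f"{window_ms:4d} ms window: lowest modulation frequency ~ {f_min:6.1f} Hz, "
          f"temporal acuity ~ {duration_s * 1000:.0f} ms")
```

Under this assumption, longer windows reach lower modulation frequencies but let the system manipulate T-F units only at a coarser time scale, which is the tension the study investigates.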
Year | Venue | Keywords |
---|---|---|
2015 | 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | speech segregation, ideal binary mask, amplitude modulation spectrogram features, temporal resolution |
Field | DocType | Citations |
Pattern recognition, Computer science, Speech recognition, Modulation, Artificial intelligence, Temporal resolution, Speech segregation | Conference | 1 |
PageRank | References | Authors |
0.35 | 0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Tobias May | 1 | 43 | 4.97 |
Thomas Bentsen | 2 | 1 | 0.35 |
Torsten Dau | 3 | 56 | 10.01 |