Abstract |
---|
In this paper, we investigate the effect of temporal context on speech/non-speech detection (SND). It is shown that even a simple feature such as full-band energy, when employed with a large-enough context, shows promise for further investigation. Experimental evaluations on the test data set, using the F-measure, show an absolute performance gain of 4.4% over a state-of-the-art multi-layer perceptron (MLP) based SND system and 5.4% over a simple energy-threshold-based SND method. The optimal contextual length was found to be 1000 ms. Further numerical optimizations yield an additional improvement of 3.37% absolute, resulting in a total absolute gain of 7.77% and 8.77% over the MLP-based and energy-based methods respectively. ROC-based performance evaluation also reveals promising performance for the proposed method, particularly in low-SNR conditions. |
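
The idea of combining full-band energy with a long temporal context can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say how the context is applied, so the moving-average smoothing, the 25 ms frame length, the 10 ms hop, the log compression, and all function names below are assumptions chosen for illustration. Only the 1000 ms context length and the use of the F-measure come from the abstract.

```python
import numpy as np


def frame_log_energy(signal, sample_rate, frame_ms=25, hop_ms=10):
    """Full-band log energy per frame (frame and hop sizes are assumed values)."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop_len)
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop_len : i * hop_len + frame_len]
        # Small constant avoids log(0) on digital silence.
        energies[i] = np.log(np.sum(frame ** 2) + 1e-10)
    return energies


def smooth_with_context(energies, context_ms=1000, hop_ms=10):
    """Average each frame's energy over a centred context window.

    A moving average is an assumed stand-in for however the paper
    actually exploits temporal context.
    """
    width = max(1, int(context_ms / hop_ms))
    kernel = np.ones(width) / width
    # Edge padding keeps the output the same length as the input.
    padded = np.pad(energies, (width // 2, width - 1 - width // 2), mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def detect_speech(signal, sample_rate, threshold, context_ms=1000):
    """Label a frame as speech when its context-smoothed energy exceeds threshold."""
    energies = frame_log_energy(signal, sample_rate)
    return smooth_with_context(energies, context_ms) > threshold


def f_measure(predicted, reference):
    """Harmonic mean of precision and recall over boolean frame labels."""
    true_pos = np.sum(predicted & reference)
    precision = true_pos / max(np.sum(predicted), 1)
    recall = true_pos / max(np.sum(reference), 1)
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))
```

In such a sketch the threshold would be tuned on held-out data, and `context_ms` could be swept to see where the F-measure peaks; the abstract reports 1000 ms as the optimum for the authors' system.
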
Year | DOI | Venue |
---|---|---|
2008 | 10.1007/978-3-540-87391-4_58 | TSD |
Keywords | Field | DocType |
simple energy threshold,promising performance,full-band energy,snd method,exploiting contextual information,absolute gain,snd system,performance evaluation,non-speech detection,large-enough context,absolute performance gain,multi layer perceptron,spectrum,speech detection | Contextual information,Voice activity detection,Computer science,Speech recognition,Absolute gain,Test data,Temporal context,Perceptron,Modulation spectrum | Conference |
Volume | ISSN | Citations |
5246 | 0302-9743 | 2 |
PageRank | References | Authors |
0.45 | 6 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sree Hari Krishnan Parthasarathi | 1 | 124 | 8.30 |
Petr Motlíček | 2 | 13 | 3.36 |
Hynek Hermansky | 3 | 3298 | 510.27 |