Title
Exploiting Contextual Information for Speech/Non-Speech Detection
Abstract
In this paper, we investigate the effect of temporal context on speech/non-speech detection (SND). We show that even a simple feature such as full-band energy, when used with a sufficiently large context, is promising enough to warrant further investigation. Experimental evaluations on the test data set, measured with the F-measure, show absolute performance gains of 4.4% and 5.4% over a state-of-the-art multi-layer perceptron (MLP) based SND system and a simple energy-threshold based SND method, respectively. The optimal context length was found to be 1000 ms. Further numerical optimization yields an additional improvement of 3.37% absolute, for total absolute gains of 7.77% and 8.77% over the MLP based and energy based methods, respectively. ROC based performance evaluation also reveals promising performance for the proposed method, particularly in low SNR conditions.
Year: 2008
DOI: 10.1007/978-3-540-87391-4_58
Venue: TSD
Keywords: simple energy threshold, promising performance, full-band energy, snd method, exploiting contextual information, absolute gain, snd system, performance evaluation, non-speech detection, large-enough context, absolute performance gain, multi layer perceptron, spectrum, speech detection
Field: Contextual information, Voice activity detection, Computer science, Speech recognition, Absolute gain, Test data, Temporal context, Perceptron, Modulation spectrum
DocType: Conference
Volume: 5246
ISSN: 0302-9743
Citations: 2
PageRank: 0.45
References: 6
Authors: 3
Name                              Order  Citations  PageRank
Sree Hari Krishnan Parthasarathi  1      124        8.30
Petr Motlíček                     2      13         3.36
Hynek Hermansky                   3      3298       510.27