Abstract | ||
---|---|---|
With the wider deployment of automatic speech recognition (ASR) systems, the importance of robust speech activity detection has been elevated both as a means of reducing bandwidth in client/server ASR and for overall system stability from barge-in through the recognition process. In this paper we investigate a novel technique for speech activity detection, that we have found to be effective in handling non-stationary noise events without negatively impacting the recognition process. This technique is based on combining acoustic phonetic likelihood based features with energy features extracted from the signal waveform. Reported results on two speech activity detection tasks demonstrate that the proposed method outperforms techniques which rely solely on acoustic or energy features. |
Year | Venue | Keywords |
---|---|---|
2005 | INTERSPEECH | automatic speech recognition, speech activity detection, client server, feature extraction |
Field | DocType | Citations |
---|---|---|
Speech processing, Pattern recognition, Voice activity detection, Computer science, Speech recognition, Artificial intelligence, Acoustic model | Conference | 3 |
PageRank | References | Authors |
---|---|---|
0.45 | 4 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Etienne Marcheret | 1 | 100 | 11.15 |
Karthik Visweswariah | 2 | 400 | 38.22 |
Gerasimos Potamianos | 3 | 1113 | 113.80 |