Title
Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations
Abstract
We describe a content-based audio classification algorithm based on novel multiscale spectro-temporal modulation features inspired by a model of auditory cortical processing. The task explored is to discriminate speech from nonspeech consisting of animal vocalizations, music, and environmental sounds. Although this is a relatively easy task for humans, it is still difficult to automate well, especially in noisy and reverberant environments. The auditory model captures basic processes occurring from the early cochlear stages to the central cortical areas. The model generates a multidimensional spectro-temporal representation of the sound, which is then analyzed by a multilinear dimensionality reduction technique and classified by a support vector machine (SVM). Generalization of the system to signals with high levels of additive noise and reverberation is evaluated and compared to two existing approaches (Scheirer and Slaney, 2002; Kingsbury et al., 2002). The results demonstrate the advantages of the auditory model over the other two systems, especially at low signal-to-noise ratios (SNRs) and high reverberation.
Year
2006
DOI
10.1109/TSA.2005.858055
Venue
IEEE Transactions on Audio, Speech, and Language Processing
Keywords
audio signal processing, modulation, speech processing, support vector machines, SVM, auditory cortical processing, content-based audio classification, multidimensional spectro-temporal representation, multilinear dimensionality reduction technique, multiscale spectro-temporal modulations, nonspeech, speech discrimination, support vector machine, audio classification and segmentation, auditory model
Field
Noise, Speech processing, Audio signal, Reverberation, Dimensionality reduction, Pattern recognition, Computer science, Support vector machine, Speech recognition, Speech discrimination, Artificial intelligence, Audio signal processing
DocType
Journal
Volume
14
Issue
3
ISSN
1558-7916
Citations
86
PageRank
4.27
References
23
Authors
3
Name             Order  Citations  PageRank
Nima Mesgarani   1      256        22.43
Malcolm Slaney   2      1797       212.76
Shihab Shamma    3      554        67.25