Title
Speech detection on broadcast audio
Abstract
Speech boundary detection contributes to the performance of speech-based applications such as speech recognition and speaker recognition. The speech boundary detector implemented in this study operates on broadcast audio as a pre-processing module of a keyword spotter. Speech boundary detection is handled in three steps. In the first step, the audio data is segmented into homogeneous regions in an unsupervised manner. In the second step, an ACTIVITY/NON-ACTIVITY decision is made for each region. In the third step, ACTIVITY regions are classified as Speech/Nonspeech via Gaussian Mixture Model (GMM) based classification. The GMMs are trained using a novel feature, Spectral Flow Direction (SFD), and an improved multi-band harmonicity feature, in addition to the widely used Mel Frequency Cepstral Coefficients (MFCCs).
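The final step described in the abstract, GMM-based Speech/Nonspeech classification of a region, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes diagonal-covariance GMMs whose parameters are already trained, and it assumes a decision rule that compares the average per-frame log-likelihood under the two models; the abstract does not state the actual rule. The function names, the toy 2-D feature vectors, and all parameter values are hypothetical, and the segmentation, activity decision, and feature extraction (SFD, harmonicity, MFCC) steps are not covered.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def classify_region(frames, speech_gmm, nonspeech_gmm):
    """Label a region by comparing average frame log-likelihoods (assumed rule)."""
    n = len(frames)
    speech_score = sum(gmm_log_likelihood(f, *speech_gmm) for f in frames) / n
    nonspeech_score = sum(gmm_log_likelihood(f, *nonspeech_gmm) for f in frames) / n
    return "Speech" if speech_score > nonspeech_score else "Nonspeech"


# Hypothetical, hand-picked models: (weights, means, variances), 2 components each.
SPEECH_GMM = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
NONSPEECH_GMM = ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]])

# Toy regions: each is a list of per-frame feature vectors.
region_a = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
region_b = [[5.2, 4.9], [6.1, 5.8]]

label_a = classify_region(region_a, SPEECH_GMM, NONSPEECH_GMM)
label_b = classify_region(region_b, SPEECH_GMM, NONSPEECH_GMM)
```

In a real system the two models would be fitted to labelled training frames (for example with the EM algorithm) rather than written by hand, and each frame would carry the full feature vector rather than two toy dimensions.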
Year
2010
Venue
European Signal Processing Conference
Field
Speech processing, Mel-frequency cepstrum, Speech coding, Pattern recognition, Voice activity detection, Computer science, Speech recognition, Speaker recognition, Artificial intelligence, Codec2, Linear predictive coding, Acoustic model
DocType
Conference
ISSN
2219-5491
Citations
4
PageRank
0.48
References
9
Authors
7
Name                 Order  Citations  PageRank
Unal Zubari          1      25         1.94
Ezgi Can Ozan        2      18         4.44
Banu Oskay Acar      3      27         3.01
Tolga Ciloglu        4      34         2.87
Ersin Esen           5      92         14.15
Tugrul K. Ates       6      35         3.70
Duygu Oskay Önür     7      6          0.87