Abstract |
---|
Switching between speech coding and generic audio coding schemes has recently proved very efficient for coding a wide range of audio material at low bit rates. However, it relies strongly on a robust classification of the input signal. The aim of this paper is to design a reliable speech and music discriminator (SMD) for such an application. Particular attention was paid to achieving a good tradeoff between the accuracy, reactivity and stability of the decision while keeping delay and complexity reasonably low. To this end, short-term and long-term features are separated and fed to two different classifiers. The two classifier outputs are combined into a final decision using hysteresis. Objective measures show that a more reliable switching decision is achievable. The SMD was successfully implemented in MPEG Unified Speech and Audio Coding (USAC). It allows the codec to show unprecedented audio quality. |
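
The abstract mentions combining two classifier outputs into a final decision with hysteresis but gives no details. The following is only a minimal sketch of that general idea, not the paper's method: the function name `hysteresis_decisions`, the equal weighting of the two scores, and the thresholds `low` and `high` are all assumptions chosen for illustration. Each input pair is taken to be a hypothetical (short-term, long-term) speech probability in [0, 1].

```python
from typing import Iterable, List, Tuple


def hysteresis_decisions(
    scores: Iterable[Tuple[float, float]],
    low: float = 0.4,            # assumed threshold for falling back to "music"
    high: float = 0.6,           # assumed threshold for switching to "speech"
    weight_short: float = 0.5,   # assumed weight of the short-term classifier
    initial: str = "music",
) -> List[str]:
    """Combine two per-frame speech probabilities and apply hysteresis.

    The state changes only when the combined score leaves the band
    [low, high], so scores hovering near the middle do not cause
    rapid toggling between the two coding modes.
    """
    state = initial
    decisions: List[str] = []
    for short_p, long_p in scores:
        # Illustrative combination rule: a simple weighted average.
        combined = weight_short * short_p + (1.0 - weight_short) * long_p
        if state == "music" and combined > high:
            state = "speech"
        elif state == "speech" and combined < low:
            state = "music"
        decisions.append(state)
    return decisions


if __name__ == "__main__":
    frames = [(0.5, 0.5), (0.7, 0.7), (0.5, 0.5), (0.3, 0.3), (0.5, 0.5)]
    print(hysteresis_decisions(frames))
```

In this sketch, widening the gap between `low` and `high` favors stability of the decision, while narrowing it favors reactivity, which is the kind of tradeoff the abstract describes.
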
Year | Venue | Keywords |
---|---|---|
2015 | European Signal Processing Conference | Speech and Music-Discrimination, Speech Coding, Audio Coding |
Field | DocType | ISSN |
Speech coding, Extended Adaptive Multi-Rate – Wideband, Voice activity detection, Computer science, Audio mining, Adaptive Multi-Rate audio codec, Speech recognition, Sub-band coding, Linear predictive coding, Acoustic model | Conference | 2076-1465 |
Citations | PageRank | References |
3 | 0.40 | 6 |
Authors |
1 |
Name | Order | Citations | PageRank |
---|---|---|---|
Guillaume Fuchs | 1 | 38 | 7.84 |