Title | ||
---|---|---|
Fast NMF based approach and VQ based approach using MFCC distance measure for speech recognition from mixed sound. |
Abstract | ||
---|---|---|
We consider a speech recognition method for mixed sound consisting of speech and music that removes only the music, based on vector quantization (VQ) and non-negative matrix factorization (NMF). Instead of the conventional amplitude-spectrum distance measure, we introduce an MFCC distance measure, which is not affected by pitch. For isolated word recognition using the clean speech model, a word error reduction rate of 53% was obtained compared with the case of not removing music. Furthermore, a high recognition rate, close to that of clean speech recognition, was obtained at 10 dB. For the multi-condition case, our proposed method reduced the error rate by 67% compared with the multi-condition model. |
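
The abstract names NMF-based music removal without giving details. As a rough illustration only, the sketch below shows a generic supervised NMF separation, not the authors' fast NMF or their MFCC distance measure. It assumes pre-trained non-negative basis matrices for speech and music (`W_speech` and `W_music`, both hypothetical names), estimates activations with the standard multiplicative update for the Euclidean cost, and keeps the speech share of the mixture with a soft mask.

```python
import numpy as np

def separate_speech(V, W_speech, W_music, n_iter=200, eps=1e-9):
    """Estimate the speech part of a mixed magnitude spectrogram.

    V        : (n_freq, n_frames) non-negative mixture spectrogram
    W_speech : (n_freq, k_s) fixed speech bases (assumed pre-trained)
    W_music  : (n_freq, k_m) fixed music bases (assumed pre-trained)
    """
    # Stack both dictionaries; only the activations H are estimated.
    W = np.hstack([W_speech, W_music])
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps

    # Multiplicative updates for the Euclidean cost ||V - WH||^2 with W fixed.
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + eps)

    # Split the reconstruction into its speech and music parts.
    k_s = W_speech.shape[1]
    speech = W_speech @ H[:k_s]
    music = W_music @ H[k_s:]

    # Soft mask: keep the fraction of each bin attributed to speech.
    mask = speech / (speech + music + eps)
    return mask * V
```

In a recognizer, the masked spectrogram would then be converted to features such as MFCCs before decoding; that step, and the paper's MFCC-domain distance, are outside this sketch.
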
Year | Venue | Keywords |
---|---|---|
2013 | Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | music,speech recognition,matrix decomposition |
Field | DocType | ISSN |
Mel-frequency cepstrum,Pattern recognition,Matrix decomposition,Word error rate,Word recognition,Speech recognition,Frequency spectrum,Vector quantization,Non-negative matrix factorization,Artificial intelligence,Mathematics,Acoustic model | Conference | 2309-9402 |
Citations | PageRank | References |
0 | 0.34 | 8 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Shoichi Nakano | 1 | 3 | 1.13 |
Kazumasa Yamamoto | 2 | 33 | 7.58 |
Seiichi Nakagawa | 3 | 598 | 104.03 |