Abstract | ||
---|---|---|
In this paper we present an efficient and flexible approach to VTLN warping factor estimation. Due to the equivalence of frequency warping and linear transformation of cepstral coefficients, warping factors can be efficiently estimated by accumulating the sufficient statistics for linear transformation estimation, and searching the constrained space of transformations given by the explicit mapping between warping factors and linear transformation matrices. We show that the positive effect of using a properly normalized optimization criterion for warping factor estimation, which has been previously demonstrated for a signal analysis front-end without a filter-bank, carries over to an MFCC front-end, resulting in a net improvement in word error rate. |
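
The estimation procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: a single diagonal-covariance Gaussian stands in for the acoustic model, the warping is a linear frequency scaling clipped at the Nyquist frequency, the mapping `warp_matrix` from warping factor to matrix is a hypothetical cosine-basis construction for a front-end without a filter-bank, and the "properly normalized" criterion is taken to be the log-likelihood plus the Jacobian term log|det A|. All function names are illustrative.

```python
import numpy as np

def warp_matrix(alpha, dim=8, n_freq=64):
    """Hypothetical A(alpha): a linear frequency warp expressed on cepstra."""
    omega = np.pi * (np.arange(n_freq) + 0.5) / n_freq   # frequency grid in (0, pi)
    n = np.arange(dim)
    basis = np.cos(np.outer(omega, n))                   # cepstra -> log spectrum
    # Log spectrum sampled at the warped frequencies (assumed clipped at pi).
    warped = np.cos(np.outer(np.minimum(alpha * omega, np.pi), n))
    return np.linalg.pinv(basis) @ warped                # project back onto cepstra

def accumulate(x):
    """Sufficient statistics of unwarped features x (frames x dim)."""
    return x.shape[0], x.sum(axis=0), x.T @ x

def objective(A, stats, mu, var, normalize=True):
    """Log-likelihood of A @ x under N(mu, diag(var)), up to a constant."""
    count, s1, s2 = stats
    quad = np.einsum("di,ij,dj->d", A, s2, A)            # sum_t (a_d . x_t)^2
    lin = A @ s1                                         # sum_t (a_d . x_t)
    sq_err = quad - 2.0 * mu * lin + count * mu**2       # sum_t (a_d . x_t - mu_d)^2
    score = -0.5 * np.sum(sq_err / var)
    if normalize:
        # Jacobian term: assumed form of the normalization.
        score += count * np.linalg.slogdet(A)[1]
    return score

def estimate_alpha(stats, mu, var, grid, normalize=True):
    """Grid search over warping factors using only the accumulated statistics."""
    scores = [objective(warp_matrix(a, mu.size), stats, mu, var, normalize)
              for a in grid]
    return float(grid[int(np.argmax(scores))])
```

The statistics are accumulated once per speaker, so each candidate warping factor costs only a few small matrix products rather than a new pass over the audio. Calling `estimate_alpha(..., normalize=False)` drops the Jacobian term, which makes it possible to compare the two criteria on the same statistics.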
Year | DOI | Venue |
---|---|---|
2006 | 10.1109/ICASSP.2006.1660242 | 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13 |
Keywords | Field | DocType |
speech recognition,mel frequency cepstral coefficient,word error rate,statistics,automatic speech recognition,linear transformation,front end,signal analysis,sufficient statistic,filter bank,cepstrum | Signal processing,Mel-frequency cepstrum,Image warping,Normalization (statistics),Pattern recognition,Matrix (mathematics),Word error rate,Linear map,Artificial intelligence,Sufficient statistic,Mathematics | Conference |
ISSN | Citations | PageRank |
1520-6149 | 3 | 0.41 |
References | Authors | |
7 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jonas Lööf | 1 | 81 | 5.81 |
Hermann Ney | 2 | 14178 | 1506.93 |
Srinivasan Umesh | 3 | 93 | 16.31 |