Abstract | ||
---|---|---|
This paper gives an up-to-date description of the IBM Mandarin broadcast transcription system developed under the DARPA GALE program. Technical advances over our previous system include a novel acoustic modeling approach using subspace Gaussian mixture models, a speaking rate adaptation method using frame rate normalization, and an effective recipe for lattice combination. We present results on three consortium-defined test sets. With these advances, the new system attains a 9% relative reduction in character error rate over our previous GALE evaluation system. The reported 9.1% error rate on the phase three evaluation set represents the state of the art in Mandarin broadcast speech transcription. |
Year | DOI | Venue |
---|---|---|
2010 | 10.1109/ICASSP.2010.5495639 | 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing |
Keywords | Field | DocType |
UBM, subspace GMM, speaking rate adaptation, CFRN, speech recognition | Language translation, Normalization (statistics), Subspace topology, Computer science, Word error rate, Speech recognition, Frame rate, Hidden Markov model, Mandarin Chinese, Mixture model | Conference |
ISSN | Citations | PageRank |
1520-6149 | 8 | 0.86 |
References | Authors | |
8 | 7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Stephen M. Chu | 1 | 372 | 26.33 |
Daniel Povey | 2 | 2442 | 231.75 |
Hong-Kwang Kuo | 3 | 71 | 9.60 |
Lidia Mangu | 4 | 1203 | 125.73 |
Shilei Zhang | 5 | 57 | 9.81 |
Qin Shi | 6 | 61 | 10.77 |
Yong Qin | 7 | 161 | 42.54 |