Abstract |
---|
This paper presents methods to improve syllable-level mispronunciation detection for Mandarin from two aspects: proposing the scaled log-posterior probability (SLPP) and the weighted phone SLPP to obtain a better measure of pronunciation quality; and introducing speaker normalization through speaker adaptive training (SAT) and speaker adaptation through selective maximum likelihood linear regression (SMLLR) to obtain a better statistical model. Experiments on a database of 8000 syllables, pronounced by 40 speakers with varied pronunciation proficiency, confirm the effectiveness of these strategies: FAR is reduced from 41.1% to 31.4% at 90% FRR and from 36.0% to 16.3% at 95% FRR. |
Year | DOI | Venue |
---|---|---|
2008 | 10.1109/ICASSP.2008.4518800 | ICASSP |
Keywords | Field | DocType |
speech processing,log-posterior probability,automatic mispronunciation detection (amd),regression analysis,maximum likelihood estimation,speaker adaptive training (sat),selective maximum likelihood linear regression (smllr),pronunciation quality,mandarin,selective maximum likelihood linear regression,natural language processing,speaker adaptive training,automatic mispronunciation detection,scaled log-posterior probability,speaker adaptation,probability,weighted phone slpp,statistical model,posterior probability | Pronunciation,Speech processing,Normalization (statistics),Pattern recognition,Regression analysis,Computer science,Speech recognition,Statistical model,Syllable,Artificial intelligence,Mandarin Chinese,Speaker adaptation | Conference |
ISSN | ISBN | Citations |
1520-6149 | 978-1-4244-1484-0 | 9 |
PageRank | References | Authors |
0.94 | 5 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Feng Zhang | 1 | 11 | 1.80 |
Chao Huang | 2 | 218 | 23.06 |
Frank K. Soong | 3 | 1395 | 268.29 |
Min Chu | 4 | 316 | 32.29 |
Ren-Hua Wang | 5 | 344 | 41.36 |
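
The abstract reports results as FAR at a fixed FRR operating point. As a rough illustration of how such a figure can be read off a set of pronunciation scores, here is a minimal sketch. It is not the paper's evaluation code: the function name `far_at_frr`, the score convention (higher means better pronunciation, reject below the threshold), and the definitions of FRR (share of correctly pronounced syllables that are rejected) and FAR (share of mispronounced syllables that are accepted) are all assumptions made for this example, and the paper's own definitions may differ.

```python
from typing import Sequence, Tuple


def far_at_frr(
    scores: Sequence[float],
    is_mispronounced: Sequence[bool],
    target_frr: float,
) -> Tuple[float, float, float]:
    """Return (threshold, frr, far) at the operating point closest to target_frr.

    Assumed convention (not taken from the paper): a higher score means better
    pronunciation, and a syllable is rejected when its score is below the
    threshold.
    """
    correct = [s for s, bad in zip(scores, is_mispronounced) if not bad]
    wrong = [s for s, bad in zip(scores, is_mispronounced) if bad]
    if not correct or not wrong:
        raise ValueError("need at least one syllable of each class")

    # Candidate thresholds: every observed score (the lowest one accepts
    # everything), plus one value above the maximum (rejects everything).
    candidates = sorted(set(scores)) + [max(scores) + 1.0]

    best = None
    for threshold in candidates:
        frr = sum(s < threshold for s in correct) / len(correct)
        far = sum(s >= threshold for s in wrong) / len(wrong)
        # Prefer the FRR closest to the target; break ties by lower FAR.
        key = (abs(frr - target_frr), far)
        if best is None or key < best[0]:
            best = (key, (threshold, frr, far))
    return best[1]


# Toy data, invented for illustration: four correct and four mispronounced syllables.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.65, 0.3, 0.2, 0.1]
toy_labels = [False, False, False, False, True, True, True, True]
threshold, frr, far = far_at_frr(toy_scores, toy_labels, target_frr=0.25)
print(f"threshold={threshold}, FRR={frr:.2f}, FAR={far:.2f}")
```

Comparing two scoring methods at the same FRR, as the abstract does, then amounts to calling the function once per method with the same `target_frr` and comparing the returned FAR values.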