Abstract |
---|
We observe that human listeners distinguish one dialect from another by attending to particular phonetic and/or phonotactic patterns. Motivated by this observation, we propose a technique that emulates this process. We explore a target-aware lattice rescoring (TALR) process that revises the n-gram statistics in a lattice using target dialect information. We then derive n-gram statistics from the lattice as phonotactic features and develop a system under the vector space modeling framework. Experimental results show that the proposed technique consistently improves dialect recognition performance on 30-second test utterances. We achieve equal error rates (EERs) of 4.57% and 13.28% with 3-gram statistics for Chinese and English dialect recognition, respectively, on the 2007 NIST Language Recognition Evaluation 30-second closed test sets. |
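
To make the idea of lattice-derived n-gram statistics concrete, the following is a minimal sketch of how expected n-gram counts can be read off a phone lattice. It is not the TALR procedure from the paper, whose details the abstract does not give. The toy lattice, its edge probabilities, and the function names `path_posteriors` and `expected_ngram_counts` are all hypothetical, and brute-force path enumeration stands in for the forward-backward computation a real system would likely use.

```python
from collections import defaultdict


def path_posteriors(lattice, start, end):
    """Enumerate every start-to-end path and normalize its probability.

    `lattice` maps a node to a list of (next_node, phone, edge_probability).
    Brute-force enumeration is only practical for toy lattices.
    """
    paths = []
    stack = [(start, [], 1.0)]
    while stack:
        node, phones, prob = stack.pop()
        if node == end:
            paths.append((phones, prob))
            continue
        for next_node, phone, edge_prob in lattice.get(node, []):
            stack.append((next_node, phones + [phone], prob * edge_prob))
    total = sum(prob for _, prob in paths)
    return [(phones, prob / total) for phones, prob in paths]


def expected_ngram_counts(lattice, start, end, n):
    """Accumulate n-gram counts weighted by the posterior of each path."""
    counts = defaultdict(float)
    for phones, posterior in path_posteriors(lattice, start, end):
        for i in range(len(phones) - n + 1):
            counts[tuple(phones[i:i + n])] += posterior
    return dict(counts)


# Hypothetical toy lattice: node -> [(next_node, phone, edge probability)].
toy_lattice = {
    0: [(1, "a", 0.6), (1, "e", 0.4)],
    1: [(2, "n", 1.0)],
    2: [(3, "a", 0.5), (3, "o", 0.5)],
}

bigrams = expected_ngram_counts(toy_lattice, 0, 3, 2)
for ngram, count in sorted(bigrams.items()):
    print(ngram, round(count, 3))
```

In a vector space modeling setup, counts like these would typically be arranged into one high-dimensional feature vector per utterance and passed to a classifier. A rescoring step such as TALR would change the edge probabilities before the counts are accumulated.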
Year | Venue | Keywords |
---|---|---|
2011 | 12th Annual Conference of the International Speech Communication Association 2011 (INTERSPEECH 2011), Vols 1-5 | speech recognition, dialect recognition, spoken language recognition, lattice rescore, language model |
Field | DocType | Citations |
---|---|---|
Lattice (order), Computer science, Speech recognition | Conference | 1 |
PageRank | References | Authors |
---|---|---|
0.36 | 1 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Rong Tong | 1 | 108 | 11.33 |
Bin Ma | 2 | 64 | 6.11 |
Haizhou Li | 3 | 3678 | 334.61 |
Eng Siong Chng | 4 | 970 | 106.33 |