Title |
---|
On statistical machine translation method for lexicon refinement in speech recognition |
Abstract |
---|
In low-resource Automatic Speech Recognition (ASR), Statistical Machine Translation (SMT) is commonly used to learn transformation rules that refine a grapheme lexicon. Doing so poses two challenges. One is to generate grapheme sequences from the training data as targets, which are paired with the original transcripts to train SMT models; the other is to effectively prune the rules learned by the translation model. In this paper we extend this line of work. First, we propose a simple but effective pruning method; second, to see in which cases better rules can be learned, we investigate different setups with various acoustic and language model combinations; finally, to examine whether the rules from different setups are complementary, we merge lexicons generated via different rule tables in ASR experiments. We report a WER reduction of up to 6.2% with the proposed technique. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/ChinaSIP.2015.7230355 | 2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) |
Keywords | Field | DocType |
lexicon learning, grapheme lexicon, statistical machine translation, system fusion, automatic speech recognition | Training set, Data modeling, Grapheme, Computer science, Machine translation, Speech recognition, Lexicon, Natural language processing, Artificial intelligence, Hidden Markov model, Language model | Conference |
Citations | PageRank | References |
0 | 0.34 | 15 |
Authors |
---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Haihua Xu | 1 | 26 | 2.72 |
Xiong Xiao | 2 | 281 | 34.97 |
Eng Siong Chng | 3 | 970 | 106.33 |
Haizhou Li | 4 | 3678 | 334.61 |