Title
On statistical machine translation method for lexicon refinement in speech recognition
Abstract
In low-resource Automatic Speech Recognition (ASR), one usually resorts to Statistical Machine Translation (SMT) techniques to learn transformation rules that refine a grapheme lexicon. Doing so poses two challenges. One is to generate grapheme sequences from the training data as targets, which are paired with the original transcripts to train the SMT models; the other is to effectively prune the rules learned by the translation model. In this paper we extend this line of study. First, we propose a simple but effective pruning method; second, to see in which cases better rules can be learned, we investigate different setups with various acoustic and language model combinations; finally, to examine whether the rules from different setups are complementary, lexicons generated from different rule tables are merged in ASR experiments. We report a WER reduction of up to 6.2% with the proposed technique.
Year
2015
DOI
10.1109/ChinaSIP.2015.7230355
Venue
2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)
Keywords
lexicon learning, grapheme lexicon, statistical machine translation, system fusion, automatic speech recognition
Field
Training set, Data modeling, Grapheme, Computer science, Machine translation, Speech recognition, Lexicon, Natural language processing, Artificial intelligence, Hidden Markov model, Language model
DocType
Conference
Citations
0
PageRank
0.34
References
15
Authors
4
Name             Order   Citations   PageRank
Haihua Xu        1       26          2.72
Xiong Xiao       2       281         34.97
Eng Siong Chng   3       970         106.33
Haizhou Li       4       3678        334.61