Title |
---|
Recent advances in efficient decoding combining on-line transducer composition and smoothed language model incorporation. |
Abstract |
---|
This paper presents and evaluates our recent efforts on efficient decoding for Large Vocabulary Continuous Speech Recognition in the framework of Weighted Finite State Transducers. We evaluate on-the-fly transducer composition for reduced memory consumption, combined with weight smearing for a more time-synchronous incorporation of the language model. It turns out that in the on-line composition mode, weight smoothing within the static part of the network benefits the run-time to accuracy ratio even more than in the fully precompiled case. Evaluations are carried out on a state-of-the-art recognition system with a 10k-word vocabulary, cross-word triphone acoustic models, and a trigram language model. In this scenario, the Viterbi search is carried out fully time-synchronously in only a single pass. The combination of on-the-fly network composition with only the unigram part of the language model smoothly compiled into the network achieves a remarkably good run-time to accuracy ratio with only moderate memory requirements. |
Year | DOI | Venue |
---|---|---
2002 | 10.1109/ICASSP.2002.5743817 | ICASSP |
Keywords | Field | DocType
---|---|---
artificial neural networks, argon, hidden markov models, minimization | Triphone, Computer science, Speech recognition, Smoothing, Minification, Decoding methods, Artificial neural network, Hidden Markov model, Vocabulary, Language model | Conference
Volume | ISSN | ISBN
---|---|---
1 | 1520-6149 | 0-7803-7402-9
Citations | PageRank | References
---|---|---
14 | 1.65 | 6
Authors |
---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---
Daniel Willett | 1 | 14 | 1.65 |
Shigeru Katagiri | 2 | 850 | 114.01 |