Abstract | ||
---|---|---|
Efficient large vocabulary continuous speech recognition (LVCSR) of morphologically rich languages is a major challenge because of rapid vocabulary growth. To improve results, various subword units called morphs are used as basic language elements. The improvements over the word baseline, however, vary widely across languages and tasks, from negative to a halving of the error rate. In this paper we attempt to explore the source of this variability. Different LVCSR tasks of an agglutinative language are investigated in numerous experiments using full vocabularies. The improvements are also compared with previously published results for other languages. Important correlations are found between the morph-based improvements and both the vocabulary growth and the corpus size. Index Terms: speech recognition, rich morphology, morph, language modeling, LVCSR |
Year | Venue | Field |
---|---|---|
2010 | SLTU | Computer science, Word error rate, Agglutinative language, Natural language processing, Artificial intelligence, Vocabulary, Language model |
DocType | Citations | PageRank |
Conference | 6 | 0.52 |
References | Authors | |
12 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Balázs Tarján | 1 | 21 | 4.92 |
Péter Mihajlik | 2 | 58 | 10.15 |