
Paper Info

Title
A bilingual study on the prediction of morph-based improvement.

Abstract
Morph-based language modeling has been efficiently applied in improving the accuracy of Large-Vocabulary Continuous Speech Recognition (LVCSR) systems – especially in morphologically rich languages. However, the rate of improvement varies greatly and the underlying principles have been only superficially studied. Having a method that can predict the expected improvement prior to experimentation would be highly useful. In this paper, we introduce language-independent factors affecting morph-based improvement and show how they can be utilized in estimating the effectiveness of statistical morph-based language modeling. The task was broadcast news transcription in two less investigated languages, Hungarian and Romanian. It was found that under under-resourced conditions morph-based models can bring significant improvement – even for a morphologically less rich language like Romanian. In addition, it was shown that non-initial morph tagging can constantly outperform explicit modeling of word boundaries in terms of both letter and word accuracies.

Year	Venue	Field
2014	SLTU	Broadcasting, Romanian, Computer science, Speech recognition, Natural language processing, Artificial intelligence, Language model
DocType	Citations	PageRank
Conference	2	0.40
References	Authors
8	3

Authors (3 rows)

Name	Order	Citations	PageRank
Balázs Tarján	1	21	4.92
Tibor Fegyó	2	61	10.46
Péter Mihajlik	3	58	10.15
