Towards Better Text Processing Tools for the Ainu Language.

Paper Info

Title
Towards Better Text Processing Tools for the Ainu Language.

Abstract
In this paper we present our research devoted to the development of Natural Language Processing technologies for the Ainu language, a critically endangered language isolate spoken by the Ainu people, the native inhabitants of northern parts of the Japanese archipelago. In particular, we focused on improving the existing tools for transcription normalization, word segmentation (tokenization) and part-of-speech tagging. In the experiments we applied two Ainu language dictionaries from different domains (literary and colloquial) and created a new data set by combining them. The experiments confirmed the positive effect of these modifications on the overall performance of the tools, especially with objective samples unrelated to the training data. We also discuss further improvements obtained by applying corpus-driven language models to the problem of word segmentation and using a state-of-the-art tool for training part-of-speech taggers.

Year	DOI	Venue	DocType	Citations	PageRank	References	Authors
2017	10.1007/978-3-030-66527-2_10	LCT	Conference	0	0.34	0	3

Authors (3 rows)

Name	Order	Citations	PageRank
Karol Nowakowski	1	0	0.34
Michal Ptaszynski	2	132	25.47
Fumito Masui	3	87	27.22
