Title
Predicting Inflectional Paradigms and Lemmata of Unknown Words for Semi-automatic Expansion of Morphological Lexicons.
Abstract
In this paper we describe a semi-automatic approach to extending morphological lexicons. We frame the prediction of the correct inflectional paradigm and lemma for an unknown word as a supervised ranking task trained on an existing lexicon. While most ranking approaches rely only on heuristics based on a single information source, our predictor uses hundreds of features computed from the candidate stem, corpus evidence, and statistics from the existing lexicon. Using Croatian as an example, we show that our approach significantly outperforms a heuristic-based baseline, ranking the correct candidate first in 77% of cases and among the top five in 95% of cases.
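The candidate-ranking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paradigm table, the two features (corpus coverage and a paradigm prior), and the hand-set weight are all hypothetical stand-ins, and a fixed linear score takes the place of the trained ranker with hundreds of features. The endings are a toy simplification, not a complete description of Croatian morphology.

```python
from typing import NamedTuple

# Hypothetical toy paradigms: name -> (lemma ending, endings of all inflected forms).
PARADIGMS = {
    "noun_fem_a": ("a", ["a", "e", "i", "u", "om"]),
    "noun_masc_0": ("", ["", "a", "u", "om", "e"]),
}

# Hypothetical paradigm priors, e.g. relative frequencies in an existing lexicon.
PRIOR = {"noun_fem_a": 0.5, "noun_masc_0": 0.5}


class Candidate(NamedTuple):
    lemma: str
    paradigm: str
    score: float


def rank_candidates(word, paradigms, corpus, prior, prior_weight=0.1):
    """Generate (lemma, paradigm) candidates for `word` and sort them by score."""
    candidates = {}
    for name, (lemma_ending, endings) in paradigms.items():
        for ending in endings:
            # A paradigm is a candidate if one of its endings matches the word.
            if not word.endswith(ending) or len(word) <= len(ending):
                continue
            stem = word[: len(word) - len(ending)]
            lemma = stem + lemma_ending
            # Feature 1 (corpus evidence): share of generated forms seen in the corpus.
            forms = [stem + e for e in endings]
            coverage = sum(f in corpus for f in forms) / len(forms)
            # Feature 2 (lexicon statistics): prior probability of the paradigm.
            score = coverage + prior_weight * prior[name]
            candidates[(lemma, name)] = Candidate(lemma, name, score)
    return sorted(candidates.values(), key=lambda c: c.score, reverse=True)


# Toy corpus of attested word forms (an assumption for this example).
corpus = {"riba", "ribe", "ribi", "ribu", "ribom"}
ranked = rank_candidates("ribu", PARADIGMS, corpus, PRIOR)
print(ranked[0].lemma, ranked[0].paradigm)  # riba noun_fem_a
```

In a semi-automatic workflow, the top few entries of such a ranked list would be shown to a lexicographer for confirmation rather than added to the lexicon directly.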
Year: 2015
Venue: RANLP
Field: Heuristic, Ranking, Computer science, Lexicon, Heuristics, Natural language processing, Artificial intelligence, Machine learning, Lemma (mathematics)
DocType: Conference
Citations: 0
PageRank: 0.34
References: 10
Authors: 4
Name                      Order  Citations  PageRank
Nikola Ljubesic           1      83         22.19
Miquel Esplà-Gomis        2      66         14.79
Filip Klubicka            3      8          5.21
Nives Mikelic Preradovic  4      5          2.97