Title
IWNLP: Inverse Wiktionary for Natural Language Processing
Abstract
Many natural language processing (NLP) pipelines today are based on training data created by a few experts. This paper examines how the proliferation of the internet and its collaborative application possibilities can be put to practical use for NLP. For that purpose, we examine how the German version of Wiktionary can be used for a lemmatization task. We introduce IWNLP, an open-source parser for Wiktionary that reimplements several MediaWiki markup language templates for conjugated verbs and declined adjectives. The lemmatization task is evaluated on three German corpora, on which we compare our results with existing lemmatization software. With Wiktionary as a resource, we obtain high accuracy for the lemmatization of nouns and can even improve on the results of existing software for this task.
Year
2015
Venue
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL) and the 7th International Joint Conference on Natural Language Processing (IJCNLP), Vol. 2
Field
Training set, Lemmatisation, Computer science, Noun, Software, Artificial intelligence, Natural language processing, Parsing, German, The Internet, Markup language
DocType
Conference
Volume
P15-2
Citations
2
PageRank
0.43
References
7
Authors
2
Name, Order, Citations, PageRank
Matthias Liebeck, 1, 2, 1.45
Stefan Conrad, 2, 168, 105.91