Title
Using Morphological and Semantic Features for the Quality Assessment of Russian Wikipedia.
Abstract
The assessment of the quality and credibility of Wikipedia articles is becoming increasingly important. We propose using morphological and semantic features to estimate the quality of Russian-language Wikipedia articles. We identified over 150 linguistic features and divided them into four groups. Within these groups, we considered features of encyclopedic style, readability, and subjectivism of the article's text. Using Random Forest as the classification algorithm, we show which linguistic features are most important for the quality of Russian Wikipedia articles, and we compare the classification results obtained with each of the four feature groups separately. We achieved an F-measure of 89.75%.
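For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F-measure can be computed from predicted and true quality labels. It is not the authors' evaluation code: the abstract does not state the number of quality classes or the averaging scheme, so the two labels ("good", "needs_work"), the toy data, and the function name f_measure are all hypothetical.

```python
from collections import Counter


def f_measure(y_true, y_pred, positive):
    """Return (precision, recall, F1) with one class treated as positive."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["tp"] += 1  # correctly predicted positive
        elif pred == positive:
            counts["fp"] += 1  # predicted positive, actually negative
        elif truth == positive:
            counts["fn"] += 1  # missed positive
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical article-quality labels, for illustration only.
y_true = ["good", "good", "good", "needs_work",
          "needs_work", "needs_work", "good", "needs_work"]
y_pred = ["good", "good", "needs_work", "needs_work",
          "needs_work", "good", "good", "needs_work"]

precision, recall, f1 = f_measure(y_true, y_pred, positive="good")
print(f"precision={precision:.2%} recall={recall:.2%} F1={f1:.2%}")
```

With more than two quality classes, the per-class scores would typically be averaged (for example, macro or weighted averaging); which variant the paper uses is not stated in the abstract.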
Year
2017
DOI
10.1007/978-3-319-67642-5_46
Venue
Communications in Computer and Information Science
Keywords
Quality assessment of texts, Morphological and semantic features, Russian Wikipedia articles, Random forests classification, Encyclopedic, Readability, Subjectivism
Field
Information retrieval, Credibility, Computer science, Readability, Natural language processing, Artificial intelligence, Random forest, Subjectivism
DocType
Conference
Volume
756
ISSN
1865-0929
Citations
2
PageRank
0.37
References
10
Authors
5
Name                     Order  Citations  PageRank
Wlodzimierz Lewoniewski  1      10         1.84
Nina Khairova            2      6          1.82
Krzysztof Wecel          3      82         12.56
Nataliia Stratiienko     4      2          0.37
Witold Abramowicz        5      475        64.66