Title | ||
---|---|---|
Using Morphological and Semantic Features for the Quality Assessment of Russian Wikipedia. |
Abstract | ||
---|---|---|
The assessment of the quality and credibility of Wikipedia articles is becoming increasingly important. We propose using morphological and semantic features to estimate the quality of Wikipedia articles in the Russian language. We identified over 150 linguistic features and divided them into four groups. In these groups, we considered features of the encyclopedic style, readability, and subjectivism of the article's text. Using Random Forest as the classification algorithm, we show the most important linguistic features that affect the quality of Russian Wikipedia articles. We compare the classification results of our four linguistic feature groups separately. We achieved an F-measure of 89.75%. |
Year | DOI | Venue |
---|---|---|
2017 | 10.1007/978-3-319-67642-5_46 | Communications in Computer and Information Science |
Keywords | Field | DocType |
Quality assessment of texts, Morphological and semantics features, Russian Wikipedia articles, Random forests classification, Encyclopedic, Readability, Subjectivism | Information retrieval, Credibility, Computer science, Readability, Natural language processing, Artificial intelligence, Random forest, Subjectivism | Conference |
Volume | ISSN | Citations |
756 | 1865-0929 | 2 |
PageRank | References | Authors |
0.37 | 10 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wlodzimierz Lewoniewski | 1 | 10 | 1.84 |
Nina Khairova | 2 | 6 | 1.82 |
Krzysztof Wecel | 3 | 82 | 12.56 |
Nataliia Stratiienko | 4 | 2 | 0.37 |
Witold Abramowicz | 5 | 475 | 64.66 |