Title
The Text Classification Based on Big Data Analysis for Keyword Definition Using Stemming
Abstract
Software for stemming Ukrainian-language texts and methods for classifying them using the Porter algorithm have been developed and implemented. The software is written in Python using the NLTK library. Existing methods, including classification and clustering, were analysed. Methods for vectorising text data and approaches to maintaining the dictionary were considered. In addition, information about previously analysed data is saved.
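The stemming and vectorisation steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and not the Porter algorithm itself: it is a plain longest-suffix-stripping stemmer with a hypothetical, deliberately tiny suffix list, followed by a bag-of-words count over stems. The names `SUFFIXES`, `MIN_STEM_LEN`, `stem`, `tokenize` and `vectorize` are assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import re
from collections import Counter

# Hypothetical, very small list of Ukrainian endings (illustration only).
# Sorted longest first so that longer endings are tried before shorter ones.
SUFFIXES = sorted(
    ["ами", "ями", "ого", "ому", "ів", "ах", "ях", "ою", "ий",
     "а", "и", "і", "у", "я", "ю", "е", "о"],
    key=len,
    reverse=True,
)

# Assumed lower bound on stem length, to avoid stripping short words to nothing.
MIN_STEM_LEN = 3


def stem(word):
    """Strip the longest matching suffix, keeping at least MIN_STEM_LEN letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Split text into lowercase runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def vectorize(text, vocabulary):
    """Return a bag-of-words count vector over stems.

    `vocabulary` maps stem -> column index and is extended in place,
    so stems seen in earlier texts keep their positions.
    """
    counts = Counter(stem(token) for token in tokenize(text))
    for s in counts:
        vocabulary.setdefault(s, len(vocabulary))
    vector = [0] * len(vocabulary)
    for s, n in counts.items():
        vector[vocabulary[s]] = n
    return vector
```

Passing the same `vocabulary` dictionary to successive `vectorize` calls keeps the stem-to-index mapping from earlier texts, which is one simple way to retain information about previously analysed data. The resulting vectors could then be fed to a classifier such as a naive Bayes model.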
Year
2021
DOI
10.1109/CSIT52700.2021.9648764
Venue
2021 IEEE 16th International Conference on Computer Sciences and Information Technologies (CSIT)
Keywords
stemming, lemmatisation, neural network, Bayesian classifier, python programming language, word model, natural language, Ukrainian texts, classification, clustering, Python, NLTK, text classification
DocType
Conference
Volume
1
ISSN
2766-3655
ISBN
978-1-6654-4258-9
Citations
0
PageRank
0.34
References
2
Authors
5
Name                Order   Citations   PageRank
Andrii Berko        1       0           1.01
Yurii Matseliukh    2       0           0.34
Yurii Ivaniv        3       0           0.34
Lyubomyr Chyrun     4       0           1.35
Vadim Schuchmann    5       0           1.35