Abstract | ||
---|---|---|
In this work, a combined strategy is proposed to address class imbalance in the classification of short-text data sets. An improved K-means sampling method and category guide words are used to improve classification accuracy on unbalanced data, and the vector space model (VSM) is then used to represent the text. Finally, Naive Bayes classifiers are used to classify the unbalanced short texts. Experiments show that this method is effective and feasible for classifying minority-class events in unbalanced short-text data. The method can improve minority-class classification accuracy and provide a decision basis for the government to respond quickly and precisely to emergencies. |
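
The last two stages named in the abstract (a vector space representation followed by a Naive Bayes classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the improved K-means sampling step and the category guide words are omitted because the abstract does not specify them, and the names `to_vector`, `NaiveBayes`, and the toy hotline-style messages are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def to_vector(text):
    """Bag-of-words term-frequency vector (a simple VSM representation)."""
    return Counter(text.lower().split())


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.term_counts = defaultdict(Counter)      # term frequencies per class
        for text, label in zip(texts, labels):
            self.term_counts[label].update(to_vector(text))
        self.vocab = {t for c in self.term_counts.values() for t in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)    # log prior
            for term, tf in to_vector(text).items():
                if term in self.vocab:               # ignore unseen terms
                    score += tf * math.log((counts[term] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, deliberately unbalanced toy data: four "noise" messages, one "fire".
train = [
    ("loud music at night", "noise"),
    ("neighbor noise late night", "noise"),
    ("construction noise early morning", "noise"),
    ("loud party noise", "noise"),
    ("smoke and fire in building", "fire"),
]
texts, labels = zip(*train)
model = NaiveBayes().fit(texts, labels)

print(model.predict("fire and smoke on the stairs"))  # fire
print(model.predict("loud noise at night"))           # noise
```

In this sketch the class prior is estimated directly from the unbalanced training counts, so it works against the minority class; a rebalancing step such as the sampling described in the abstract would be applied to the training set before `fit`.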
Year | DOI | Venue |
---|---|---|
2017 | 10.1109/ICBK.2017.21 | 2017 IEEE International Conference on Big Knowledge (ICBK) |
Keywords | Field | DocType |
Unbalanced short-text, Category guide word, Text classification | Data mining, Hotline, Vector space, Data set, Naive Bayes classifier, Computer science, Sampling (statistics), Artificial intelligence, Machine learning | Conference |
ISBN | Citations | PageRank |
978-1-5386-3121-8 | 0 | 0.34 |
References | Authors | |
0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Li-Yuan Geng | 1 | 0 | 0.34 |
Wei Jin | 2 | 83 | 25.25 |
Han-Bing Qu | 3 | 0 | 0.34 |