Abstract | ||
---|---|---|
Classification is an important and well-known technique in machine learning, and the training data significantly influence classification accuracy. However, training data in real-world applications often have an imbalanced class distribution, so it is important to select suitable training data for classification in this setting. In this paper, we propose a cluster-based sampling approach that selects representative data as training data to improve classification accuracy, and we investigate the effect of under-sampling methods on the imbalanced class distribution problem. In the experiments, we evaluate the performance of our cluster-based sampling approach and of the sampling methods from previous studies. |
Year | DOI | Venue |
---|---|---|
2006 | 10.1109/ICSMC.2006.384787 | SMC |
Keywords | Field | DocType |
pattern clustering,pattern classification,backpropagation neural network,backpropagation,sampling method,sampling methods,machine learning,imbalanced data distribution,neural nets | Training set,Data mining,Pattern clustering,Computer science,Sampling (statistics),Artificial intelligence,Backpropagation,Artificial neural network,Machine learning | Conference |
Volume | ISSN | ISBN |
5 | 1062-922X | 1-4244-0100-3 |
Citations | PageRank | References |
14 | 0.85 | 10 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Show-Jane Yen | 1 | 537 | 130.05 |
Yue-Shi Lee | 2 | 543 | 41.14 |
Cheng-Han Lin | 3 | 201 | 16.39 |
Jia-Ching Ying | 4 | 34 | 3.18 |