Abstract | ||
---|---|---|
Classifying data with an imbalanced class distribution significantly limits the performance attainable by most standard classifier learning algorithms, which assume a relatively balanced class distribution and equal misclassification costs. This learning difficulty has attracted considerable research interest. Most efforts concentrate on bi-class problems. However, the bi-class case is not the only scenario where the class imbalance problem arises, and reported solutions for bi-class applications are not applicable to multi-class problems. In this paper, we develop a cost-sensitive boosting algorithm to improve classification performance on imbalanced data involving multiple classes. One barrier to applying the cost-sensitive boosting algorithm to imbalanced data is that the cost matrix is often unavailable for a problem domain. To solve this problem, we apply a genetic algorithm to search for the optimum cost setup of each class. Empirical tests show that the proposed cost-sensitive boosting algorithm significantly improves classification performance on imbalanced data sets. |
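
The abstract does not state the boosting update itself, so the following is only a minimal sketch of how per-class costs could enter a boosting round. It assumes an AdaBoost-style reweighting in which each sample weight is scaled by the cost of its true class. The function name `cost_sensitive_round`, the class labels, and the cost values are hypothetical and are not taken from the paper. In the setting the abstract describes, the `costs` mapping is what a genetic algorithm would search over.

```python
import math

def cost_sensitive_round(weights, labels, preds, costs):
    """One hypothetical cost-weighted boosting round (assumed update form).

    weights: current sample weights, summing to 1
    labels, preds: true and predicted class of each sample
    costs: assumed per-class misclassification cost
    Returns (alpha, new_weights).
    """
    triples = list(zip(weights, labels, preds))
    # Cost-weighted mass of correctly and wrongly classified samples.
    right = sum(costs[y] * w for w, y, p in triples if y == p)
    wrong = sum(costs[y] * w for w, y, p in triples if y != p)
    if wrong == 0 or right <= wrong:
        raise ValueError("need a nonzero cost-weighted error below the correct mass")
    # Vote strength of this round's weak classifier.
    alpha = 0.5 * math.log(right / wrong)
    # Scale by class cost, shrink correct samples, grow misclassified ones.
    raw = [costs[y] * w * math.exp(-alpha if y == p else alpha)
           for w, y, p in triples]
    total = sum(raw)
    return alpha, [r / total for r in raw]

# Toy data: three majority samples and one minority sample that is misclassified.
labels = ["major", "major", "major", "minor"]
preds = ["major", "major", "major", "major"]
costs = {"major": 1.0, "minor": 2.0}  # illustrative values only
alpha, new_weights = cost_sensitive_round([0.25] * 4, labels, preds, costs)
print(round(alpha, 4), [round(w, 4) for w in new_weights])
```

Under this assumed update, the misclassified samples together hold half of the total weight after normalization, so the single minority sample in the toy data dominates the next round. Raising the cost of a class increases how strongly its errors steer later weak classifiers.
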
Year | DOI | Venue |
---|---|---|
2006 | 10.1109/ICDM.2006.29 | ICDM |
Keywords | Field | DocType |
learning multiple classes,problem domain,bi-class application,cost-sensitive boosting algorithm,bi-class problem,learning (artificial intelligence),pattern classification,multi-class problem,imbalanced class distribution,data classification,balanced class distribution,boosting algorithm,class imbalance problem,multiple class,classifier learning algorithm,genetic algorithm,genetic algorithms,data mining,multiple classes imbalance learning,imbalanced data,classification performance,learning artificial intelligence | Drawback,Data mining,Data set,Cost matrix,Problem domain,Computer science,Boosting (machine learning),Artificial intelligence,Data classification,Classifier (linguistics),Genetic algorithm,Machine learning | Conference |
ISSN | ISBN | Citations |
1550-4786 | 0-7695-2701-7 | 91 |
PageRank | References | Authors |
2.64 | 17 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yanmin Sun | 1 | 770 | 21.67 |
Mohamed S. Kamel | 2 | 4523 | 282.55 |
Yang Wang | 3 | 948 | 155.42 |