Abstract | ||
---|---|---|
Rare-category detection helps discover new rare classes in an unlabeled data set by selecting candidate data examples from those classes for labeling. Most existing approaches to rare-category detection require prior information about the data set and are not applicable without it. Prior-free algorithms address this problem without such prior information, but at the cost of high time complexity, which is no lower than $O(dN^{2})$, where $N$ is the number of data examples in the data set and $d$ is the dimension of the data set. In this paper, we propose CLOVER, a prior-free algorithm that introduces a novel rare-category criterion known as local variation degree (LVD). LVD utilizes the characteristics of rare classes to distinguish rare-class data examples from other types of data examples, and the data examples with maximum LVD values are passed to CLOVER for labeling. A notable improvement is that CLOVER's time complexity is $O(dN^{2-1/d})$ for $d > 1$ or $O(N \log N)$ for $d = 1$. Extensive experimental results on real data sets demonstrate the effectiveness and efficiency of our method in terms of discovering new rare classes and lower time complexity. © 2012 Springer-Verlag London Limited. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1007/s10115-012-0530-9 | Knowl. Inf. Syst. |
Keywords | Field | DocType |
κnn, histogram density estimation, local variation degree, mκnn, rare-category detection | Data mining, Binary logarithm, Data set, Algorithm, Data type, Artificial intelligence, Time complexity, Machine learning, Mathematics | Journal |
Volume | Issue | ISSN |
35 | 3 | 0219-3116 |
Citations | PageRank | References |
12 | 0.61 | 33 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hao Huang | 1 | 89 | 7.77 |
Qinming He | 2 | 371 | 41.53 |
Kevin Chiew | 3 | 116 | 11.06 |
Feng Qian | 4 | 56 | 4.26 |
Lianhang Ma | 5 | 58 | 3.96 |