The Effect of Aggregating Subtype Performances Depends Strongly on the Performance Measure Used - Citegraph

Paper Info

Title
The Effect of Aggregating Subtype Performances Depends Strongly on the Performance Measure Used

Abstract
For some classification tasks the data can be partitioned into disjoint subsets based on some attribute, for example a disease subtype. It then seems logical to train a classifier with the same classes as the original classification problem for each subtype separately, such that the performance per subtype is optimized. Unfortunately, the influence of the subtype performances on the aggregated overall performance depends strongly on the performance measure used and can be very counterintuitive. We show that for some performance measures (e.g., classification accuracy, precision, recall, F1) the aggregated performance is a simple linear combination of subtype performances. In these cases, improving the performance of a subtype-specific classifier implies that the overall performance improves. However, for other performance measures (e.g., balanced accuracy rate, area under the ROC curve) and also for performance measures in survival analysis (concordance index), additional cross terms appear in the aggregation of the subtype performances. These cross terms are heavily dependent on both the overall class imbalance and the subtype class imbalances. For these measures, improving subtype performances may actually result in a decrease of the overall performance.
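The contrast between accuracy and balanced accuracy rate described in the abstract can be illustrated with a small numerical sketch. This is not the paper's derivation, which is not reproduced on this page. The confusion counts, the two subtypes A and B, and the helper names `Counts`, `accuracy`, `balanced_accuracy` and `pool` are all hypothetical, chosen so that the two subtypes have opposite class imbalances.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion counts of a binary classifier on one subtype (hypothetical)."""
    tp: int
    fn: int
    tn: int
    fp: int

    @property
    def n(self) -> int:
        return self.tp + self.fn + self.tn + self.fp


def accuracy(c: Counts) -> float:
    return (c.tp + c.tn) / c.n


def balanced_accuracy(c: Counts) -> float:
    tpr = c.tp / (c.tp + c.fn)  # true positive rate
    tnr = c.tn / (c.tn + c.fp)  # true negative rate
    return 0.5 * (tpr + tnr)


def pool(subtypes) -> Counts:
    """Aggregate subtypes by summing their confusion counts."""
    return Counts(
        tp=sum(c.tp for c in subtypes),
        fn=sum(c.fn for c in subtypes),
        tn=sum(c.tn for c in subtypes),
        fp=sum(c.fp for c in subtypes),
    )


# Invented counts: subtype A is 90% positive, subtype B is 90% negative.
a_before = Counts(tp=81, fn=9, tn=5, fp=5)
a_after = Counts(tp=72, fn=18, tn=7, fp=3)  # retrained subtype-A classifier
b = Counts(tp=8, fn=2, tn=72, fp=18)        # subtype B is left unchanged

for label, a in (("before", a_before), ("after", a_after)):
    overall = pool([a, b])
    # Accuracy aggregates linearly, weighted by subtype size n_k / n.
    weighted_acc = sum(c.n / overall.n * accuracy(c) for c in (a, b))
    assert math.isclose(weighted_acc, accuracy(overall))
    print(
        f"{label}: balanced accuracy A={balanced_accuracy(a):.3f}, "
        f"overall={balanced_accuracy(overall):.3f}, "
        f"overall accuracy={accuracy(overall):.3f}"
    )
```

With these invented counts, the balanced accuracy of subtype A rises while the balanced accuracy of the pooled data falls, because the pooled true positive rate is dominated by the positive-heavy subtype A. The pooled accuracy, by contrast, always equals the size-weighted average of the subtype accuracies.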

Year	DOI	Venue
2014	10.1109/ICPR.2014.639	Pattern Recognition
Keywords	DocType	ISSN
pattern classification,ROC curve,classification accuracy,classification problem,data classification tasks,disease subtype,disjoint subsets,performance measure,subtype class imbalances,subtype performances aggregation,subtype-specific classifier,AUC,Classifier performance evaluation,balanced accuracy rate,class imbalance,error decomposition	Conference	1051-4651
Citations	PageRank	References
0	0.34	0
Authors
4

Authors (4 rows)

Name	Order	Citations	PageRank
David M. J. Tax	1	261	15.58
Herman M. J. Sontrop	2	20	0.89
Reinders, M.J.T.	3	343	20.33
Perry D. Moerland	4	31	3.30
