Title
Heterogeneous defect prediction with two-stage ensemble learning.
Abstract
Heterogeneous defect prediction (HDP) refers to predicting defect-prone software modules in one project (the target) using heterogeneous data collected from other projects (the source). Several HDP methods have been proposed recently. However, these methods do not sufficiently account for two characteristics of defect data: (1) the data may be linearly inseparable, and (2) the data may be highly imbalanced. These two characteristics make it challenging to build an effective HDP model. In this paper, we propose a novel Two-Stage Ensemble Learning (TSEL) approach to HDP, which contains two stages: an ensemble multi-kernel domain adaptation (EMDA) stage and an ensemble data sampling (EDS) stage. In the EMDA stage, we develop an Ensemble Multiple Kernel Correlation Alignment (EMKCA) predictor, which combines the advantages of multiple kernel learning and domain adaptation techniques. In the EDS stage, we employ the RESample with replacement (RES) technique to learn multiple different EMKCA predictors and use an average ensemble to combine them. Together, these two stages create an ensemble of defect predictors. Extensive experiments on 30 public projects show that the proposed TSEL approach outperforms a range of competing methods. The improvement is 20.14–33.92% in AUC, 36.05–54.78% in f-measure, and 5.48–19.93% in balance.
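The EDS stage described in the abstract (resampling with replacement, then averaging the resulting predictors) can be illustrated with a minimal sketch. This is not the authors' implementation: the base learner here is a simple nearest-centroid scorer standing in for the EMKCA predictor, drawing an equal number of samples per class is an assumption made to address imbalance, and the names `resample_balanced`, `fit_centroid_scorer`, and `fit_average_ensemble` are hypothetical.

```python
import math
import random
from statistics import mean


def resample_balanced(X, y, rng):
    """Draw samples with replacement, the same number from each class.

    Assumption: balancing the classes during resampling; the abstract
    only states that resampling with replacement is used.
    """
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    n = max(len(pos), len(neg))
    idx = [rng.choice(pos) for _ in range(n)] + [rng.choice(neg) for _ in range(n)]
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_centroid_scorer(X, y):
    """Stand-in base learner (NOT EMKCA): score by distance to class centroids."""
    def centroid(rows):
        return [mean(col) for col in zip(*rows)]

    c_pos = centroid([x for x, label in zip(X, y) if label == 1])
    c_neg = centroid([x for x, label in zip(X, y) if label == 0])

    def score(x):
        d_pos = math.dist(x, c_pos)
        d_neg = math.dist(x, c_neg)
        total = d_pos + d_neg
        # Closer to the defective centroid gives a score nearer to 1.
        return 0.5 if total == 0 else d_neg / total

    return score


def fit_average_ensemble(X, y, n_members=10, seed=0):
    """Train one base learner per resampled set and average their scores."""
    rng = random.Random(seed)
    members = [
        fit_centroid_scorer(*resample_balanced(X, y, rng))
        for _ in range(n_members)
    ]

    def predict_score(x):
        return mean(member(x) for member in members)

    return predict_score
```

Calling `fit_average_ensemble(X, y)` on a labeled feature matrix returns a function that maps one module's feature vector to a defect-proneness score in [0, 1]; any base learner with the same fit-then-score shape could be substituted for the centroid scorer.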
Year
2019
DOI
10.1007/s10515-019-00259-1
Venue
Automated Software Engineering
Keywords
Heterogeneous defect prediction, Two-stage ensemble learning, Linear inseparability, Multiple kernel learning, Class imbalance, Data sampling, Domain adaptation
Field
Kernel (linear algebra), Software modules, Computer science, Domain adaptation, Multiple kernel learning, Theoretical computer science, Correlation, Artificial intelligence, Data sampling, Ensemble learning, Machine learning
DocType
Journal
Volume
26
Issue
3
ISSN
0928-8910
Citations
3
PageRank
0.35
References
37
Authors
6
Name             Order  Citations  PageRank
Zhiqiang Li      1      44         3.41
Xiao-Yuan Jing   2      769        55.18
Xiaoke Zhu       3      78         7.77
Hongyu Zhang     4      864        50.03
Xu, Baowen       5      2476       165.27
Shi Ying         6      334        31.11