Title
Heterogeneous Defect Prediction via Exploiting Correlation Subspace.
Abstract
Software defect prediction models are generally built from intra-project data. A lack of training data in the early stages of software testing limits the efficiency of prediction in practice. Researchers have therefore proposed cross-project defect prediction, which uses data from other projects. Most previous work assumes that cross-project defect data share the same metric set, meaning that both the metrics used and the size of the metric set are identical across projects. In real scenarios, however, this assumption may not hold. In addition, software defect datasets suffer from class imbalance, which makes it harder for the learner to predict defects. In this paper, we advance canonical correlation analysis to derive a joint feature space for associating cross-project data, and we propose a novel support vector machine algorithm that incorporates the correlation transfer information into classifier design for cross-project prediction. Moreover, we take different misclassification costs into account so that the classifier is inclined to classify a module as defective, alleviating the impact of imbalanced data. Experiments on public heterogeneous datasets from different projects show that our method is more effective than state-of-the-art methods.
Keywords: defect prediction; heterogeneous metrics; class imbalance; canonical correlation analysis; support vector machine
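The cost-sensitive idea in the abstract (penalizing a missed defective module more heavily than a false alarm) can be illustrated with a class-weighted hinge loss. The following is a minimal sketch and not the authors' algorithm: it omits the canonical correlation step and the correlation transfer term entirely, and the function names, the cost values (`cost_pos`, `cost_neg`), and the training hyperparameters are assumptions chosen for illustration.

```python
import random


def train_cost_sensitive_svm(X, y, cost_pos=5.0, cost_neg=1.0,
                             lam=0.01, lr=0.05, epochs=200, seed=0):
    """Linear SVM trained by stochastic subgradient descent on a
    class-weighted hinge loss.

    X: list of feature lists; y: labels in {+1 (defective), -1 (clean)}.
    cost_pos / cost_neg are hypothetical misclassification costs: a larger
    cost_pos makes margin violations on defective modules more expensive.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            xi, yi = X[i], y[i]
            cost = cost_pos if yi == 1 else cost_neg
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # L2 regularization: shrink the weights (not the bias) a little.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge subgradient step, scaled by the class-specific cost.
            if margin < 1.0:
                w = [wj + lr * cost * yi * xj for wj, xj in zip(w, xi)]
                b += lr * cost * yi
    return w, b


def predict(w, b, x):
    """Return +1 (defective) or -1 (clean) for one feature vector."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy, made-up data: one metric per module, few defective examples.
X_toy = [[-2.0], [-1.5], [-1.0], [1.0], [2.0]]
y_toy = [-1, -1, -1, 1, 1]
w_toy, b_toy = train_cost_sensitive_svm(X_toy, y_toy)
labels = [predict(w_toy, b_toy, x) for x in X_toy]
```

In a heterogeneous setting, the inputs to such a classifier would be the source and target data after projection into the shared correlation subspace rather than the raw metrics; that projection is outside the scope of this sketch.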
Year
2016
Venue
SEKE
Field
Training set, Data mining, Feature vector, Subspace topology, Computer science, Canonical correlation, Software bug, Support vector machine, Correlation, Classifier (linguistics)
DocType
Conference
Citations
3
PageRank
0.37
References
17
Authors
6
Name            Order  Citations  PageRank
Ming Cheng      1      3          1.04
Guoqing Wu      2      38         14.26
Min Jiang       3      7          4.15
Hongyan Wan     4      6          3.46
Guoan You       5      4          0.71
Mengting Yuan   6      5          0.72