Title
An empirical framework for defect prediction using machine learning techniques with Android software.
Abstract
Display Omitted Use of appropriate and large number data sets for comparing 18 ML techniques for defect prediction using object-oriented metrics.Effective performance of the predicted models assessed using appropriate performance measures.Reliability of the results evaluated using statistical test and post-hoc analysis.Validating the predicted models using inter-release validation on various releases of seven application packages of Android software. ContextSoftware defect prediction is important for identification of defect-prone parts of a software. Defect prediction models can be developed using software metrics in combination with defect data for predicting defective classes. Various studies have been conducted to find the relationship between software metrics and defect proneness, but there are few studies that statistically determine the effectiveness of the results. ObjectiveThe main objectives of the study are (i) comparison of the machine-learning techniques using data sets obtained from popular open source software (ii) use of appropriate performance measures for measuring the performance of defect prediction models (iii) use of statistical tests for effective comparison of machine-learning techniques and (iv) validation of models over different releases of data sets. MethodIn this study we use object-oriented metrics for predicting defective classes using 18 machine-learning techniques. The proposed framework has been applied to seven application packages of well known, widely used Android operating system viz. Contact, MMS, Bluetooth, Email, Calendar, Gallery2 and Telephony. The results are validated using 10-fold and inter-release validation methods. The reliability and significance of the results are evaluated using statistical test and post-hoc analysis. ResultsThe results show that the area under the curve measure for Nave Bayes, LogitBoost and Multilayer Perceptron is above 0.7 in most of the cases. The results also depict that the difference between the ML techniques is statistically significant. However, it is also proved that the Support Vector Machines based techniques such as Support Vector Machines and voted perceptron do not possess the predictive capability for predicting defects. ConclusionThe results confirm the predictive capability of various ML techniques for developing defect prediction models. The results also confirm the superiority of one ML technique over the other ML techniques. Thus, the software engineers can use the results obtained from this study in the early phases of the software development for identifying defect-prone classes of given software.
Year
DOI
Venue
2016
10.1016/j.asoc.2016.04.032
Appl. Soft Comput.
Keywords
Field
DocType
Object-oriented metrics,Machine-learning,Software defect proneness,Statistical tests,Inter-release validation
Data mining,Data set,Computer science,Multilayer perceptron,Software,Artificial intelligence,LogitBoost,Software metric,Perceptron,Machine learning,Statistical hypothesis testing,Software development
Journal
Volume
Issue
ISSN
49
C
1568-4946
Citations 
PageRank 
References 
6
0.48
0
Authors
1
Name
Order
Citations
PageRank
Ruchika Malhotra153335.12