Title
Tuning for software analytics
Abstract
Context: Data miners have been widely used in software engineering to, say, generate defect predictors from static code measures. Such static code defect predictors perform well compared to manual methods, and they are easy to use and useful to use. But one of the \"black arts\" of data mining is setting the tunings that control the miner.Objective: We seek simple, automatic, and very effective method for finding those tunings.Method: For each experiment with different data sets (from open source JAVA systems), we ran differential evolution as an optimizer to explore the tuning space (as a first step) then tested the tunings using hold-out data.Results: Contrary to our prior expectations, we found these tunings were remarkably simple: it only required tens, not thousands, of attempts to obtain very good results. For example, when learning software defect predictors, this method can quickly find tunings that alter detection precision from 0% to 60%.Conclusion: Since (1)¿the improvements are so large, and (2)¿the tuning is so simple, we need to change standard methods in software analytics. At least for defect prediction, it is no longer enough to just run a data miner and present the result without conducting a tuning optimization study. The implication for other kinds of analytics is now an open and pressing issue.
Year
DOI
Venue
2016
10.1016/j.infsof.2016.04.017
Information and Software Technology
Keywords
DocType
Volume
Defect prediction,CART,Random forest,Differential evolution,Search-based software engineering
Journal
76
Issue
ISSN
Citations 
C
0950-5849
37
PageRank 
References 
Authors
0.69
0
3
Name
Order
Citations
PageRank
Wei Fu11896.04
Tim Menzies22886151.44
Xipeng Shen32025118.55