Title
An Empirical Assessment of Machine Learning Approaches for Triaging Reports of a Java Static Analysis Tool
Abstract
Although static analysis tools can detect critical bugs in software, developers consider their high false positive rates to be a key barrier to using them in practice. To improve the usability of these tools, researchers have recently begun to apply machine learning techniques to classify and filter false positive analysis reports. Although initial results have been promising, the long-term potential and best practices for this line of research are unclear due to the lack of detailed, large-scale empirical evaluation. To partially address this knowledge gap, we present a comparative empirical study of four machine learning techniques, namely hand-engineered features, bag of words, recurrent neural networks, and graph neural networks, for classifying false positives, using multiple ground-truth program sets. We also introduce and evaluate new data preparation routines for recurrent neural networks and node representations for graph neural networks, and show that these routines can have a substantial positive impact on classification accuracy. Overall, our results suggest that recurrent neural networks (which learn over a program's source code) outperform the other subject techniques, although interesting tradeoffs are present among all techniques. Our observations provide insight into the future research needed to speed the adoption of machine learning approaches in practice.
Year
2019
DOI
10.1109/ICST.2019.00036
Venue
2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)
Keywords
Recurrent neural networks, Tools, Java, Feature extraction, Machine learning, Static analysis
Field
Static program analysis, Computer science, Source code, Usability, Static analysis, Recurrent neural network, Feature extraction, Artificial intelligence, Empirical research, Machine learning, False positive paradox
DocType
Conference
ISSN
2381-2834
ISBN
978-1-7281-1736-2
Citations
1
PageRank
0.34
References
0
Authors
5
Name | Order | Citations, PageRank
Ugur Koc | 1 | 173.02
Shiyi Wei | 2 | 434.70
Jeffrey S. Foster | 3 | 2035174.45
Marine Carpuat | 4 | 58751.99
Adam A. Porter | 5 | 7110.05