Title
How To Better Distinguish Security Bug Reports (Using Dual Hyperparameter Optimization)
Abstract
Background: In order that the general public is not vulnerable to hackers, security bug reports need to be handled by small groups of engineers before being widely discussed. But learning how to distinguish security bug reports from other bug reports is challenging, since they may occur rarely. Data mining methods that can find such scarce targets require extensive optimization effort.
Goal: The goal of this research is to aid practitioners as they struggle to optimize methods that try to distinguish between rare security bug reports and other bug reports.
Method: Our proposed method, called SWIFT, is a dual optimizer that optimizes both learner and pre-processor options. Since this is a large space of options, SWIFT uses a technique called 𝜖-dominance that learns how to avoid operations that do not significantly improve performance.
Result: When compared to recent state-of-the-art results (from FARSEC, published in TSE'18), we find that SWIFT's dual optimization of both pre-processor and learner is more useful than optimizing each of them individually. For example, in a study of security bug reports from the Chromium dataset, the median recalls of FARSEC and SWIFT were 15.7% and 77.4%, respectively. As another example, in experiments with data from the Ambari project, the median recall improved from 21.5% to 85.7% (FARSEC to SWIFT).
Conclusion: Overall, our approach can quickly optimize models that achieve better recalls than the prior state-of-the-art. These increases in recall are associated with moderate increases in false positive rates (from 8% to 24%, median). For future work, these results suggest that dual optimization is both practical and useful.
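The abstract reports recall and false positive rate, and describes a search that skips options which do not significantly improve performance. The sketch below is only an illustration of that general idea and is not the authors' SWIFT implementation. It draws random (pre-processor, learner) pairs from a joint space and accepts a candidate only when it beats the best score so far by more than a threshold epsilon. The function names, the option labels, the `budget` and `patience` parameters, and the toy score table are all hypothetical.

```python
import random


def recall(tp, fn):
    """Fraction of actual security bug reports that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def false_positive_rate(fp, tn):
    """Fraction of non-security bug reports that were wrongly flagged."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def epsilon_improves(candidate, best, epsilon):
    """True only if candidate beats best by more than epsilon."""
    return candidate - best > epsilon


def dual_search(pre_options, learner_options, evaluate,
                epsilon=0.05, budget=30, patience=10, seed=0):
    """Random search over the joint (pre-processor, learner) space.

    Hypothetical simplification: gains smaller than epsilon are treated
    as noise, and the search stops after `patience` such non-improvements.
    """
    rng = random.Random(seed)
    best_cfg, best_score, stale = None, float("-inf"), 0
    for _ in range(budget):
        cfg = (rng.choice(pre_options), rng.choice(learner_options))
        score = evaluate(*cfg)
        if epsilon_improves(score, best_score, epsilon):
            best_cfg, best_score, stale = cfg, score, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_cfg, best_score


# Toy usage with made-up scores standing in for a real train/test run.
toy_scores = {
    ("pre_a", "learner_x"): 0.20, ("pre_a", "learner_y"): 0.45,
    ("pre_b", "learner_x"): 0.60, ("pre_b", "learner_y"): 0.78,
}
best_cfg, best_score = dual_search(
    ["pre_a", "pre_b"], ["learner_x", "learner_y"],
    lambda pre, learner: toy_scores[(pre, learner)],
)
```

In a real study, `evaluate` would train the learner on pre-processed data and return a score such as recall on held-out bug reports. The threshold trades search effort against the risk of stopping before a slightly better configuration is found.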
Year
2021
DOI
10.1007/s10664-020-09906-8
Venue
EMPIRICAL SOFTWARE ENGINEERING
Keywords
Hyperparameter Optimization, Data pre-processing, Security bug report
DocType
Journal
Volume
26
Issue
3
ISSN
1382-3256
Citations
0
PageRank
0.34
References
0
Authors
5
Name             Order  Citations  PageRank
Rui Shu          1      0          1.35
Tianpei Xia      2      0          1.35
Jianfeng Chen    3      39         3.51
Laurie Williams  4      4033       473.64
Tim Menzies      5      27         3.80