Title
Mining SQL injection and cross site scripting vulnerabilities using hybrid program analysis
Abstract
In previous work, we proposed a set of static attributes that characterize input validation and input sanitization code patterns. We showed that some of the proposed static attributes are significant predictors of SQL injection and cross site scripting vulnerabilities. Static attributes have the advantage of reflecting general properties of a program. Yet dynamic attributes collected from execution traces may reflect more specific code characteristics that are complementary to static attributes. Hence, to improve our initial work, in this paper we propose the use of dynamic attributes to complement static attributes in vulnerability prediction. Furthermore, since existing work relies on supervised learning, it depends on the availability of training data labeled with known vulnerabilities. This paper presents prediction models based on both classification and clustering, which predict vulnerabilities in the presence or absence of labeled training data, respectively. In our experiments across six applications, our new supervised vulnerability predictors based on hybrid (static and dynamic) attributes achieved, on average, 90% recall and 85% precision, a sharp increase in recall compared to static analysis-based predictions. Though not nearly as accurate, our unsupervised predictors based on clustering achieved, on average, 76% recall and 39% precision, suggesting that they can be useful in the absence of labeled training data.
Year: 2013
DOI: 10.1109/ICSE.2013.6606610
Venue: ICSE
Keywords: previous work, hybrid attributes, pattern clustering, supervised vulnerability predictors, data privacy, security, static analysis-based prediction, pattern classification, known vulnerability, defect prediction, sql injection mining, input validation and sanitization, static and dynamic analysis, privacy, vulnerability, hybrid program analysis, proposed static attribute, dynamic attribute, static attribute, static analysis-based predictions, input sanitization code pattern, cross site, program diagnostics, data mining, classification, prediction models, mining sql injection, input validation, initial work, empirical study, static attributes, security of data, training data, sql, dynamic attributes, cross site scripting vulnerabilities, clustering, predictive models, supervised learning, html, databases
Field: SQL, Data mining, Data validation, Computer science, Static analysis, Supervised learning, Cross-site scripting, Program analysis, Cluster analysis, SQL injection
DocType: Conference
Volume: 2
ISBN: 978-1-4673-3073-2
Citations: 35
PageRank: 1.26
References: 23
Authors: 3
Name, Order, Citations, PageRank
Lwin Khin Shar, 1, 180, 14.56
Hee Beng Kuan Tan, 2, 489, 45.05
Lionel C. Briand, 3, 8795, 481.98