Abstract |
---|
Program analysis is a technique for reasoning about programs without executing them, with applications in compilers, integrated development environments, and security. In this work, we present a machine learning pipeline that induces a security analyzer for programs by example. The security analyzer determines whether a program is secure or insecure based on symbolic rules deduced by the pipeline. The pipeline has two stages: a Recurrent Neural Network (RNN) and an Extractor that converts the RNN into symbolic rules. To evaluate the quality of the learned symbolic rules, we propose a sampling-based similarity measurement between two infinite regular languages. We conduct a case study using real-world data and discuss the limitations of existing techniques and possible future improvements. The results show that, with sufficient training data and a fair distribution of program paths, it is feasible to deduce symbolic security rules for the OpenJDK library, which has millions of lines of code. |
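
The abstract mentions a sampling-based similarity measurement between two infinite regular languages but does not spell out the procedure. The following is a minimal sketch of one way such a measurement could work, not the authors' method: it assumes each language is given as a deterministic finite automaton, draws random words up to a bounded length, and reports the fraction on which both automata agree. The `DFA` class, `sample_word`, `estimate_similarity`, the uniform sampling scheme, and the parameters `n_samples` and `max_len` are all hypothetical choices made for illustration.

```python
import random


class DFA:
    """A deterministic finite automaton (illustrative helper, not from the paper)."""

    def __init__(self, transitions, start, accepting):
        self.transitions = transitions  # maps (state, symbol) -> next state
        self.start = start
        self.accepting = set(accepting)

    def accepts(self, word):
        state = self.start
        for symbol in word:
            state = self.transitions.get((state, symbol))
            if state is None:  # missing transition: reject the word
                return False
        return state in self.accepting


def sample_word(alphabet, max_len, rng):
    """Draw a word whose length is uniform in [0, max_len] (an assumed distribution)."""
    length = rng.randint(0, max_len)
    return tuple(rng.choice(alphabet) for _ in range(length))


def estimate_similarity(dfa_a, dfa_b, alphabet, n_samples=10_000, max_len=20, seed=0):
    """Estimate the fraction of sampled words on which both automata agree."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_samples):
        word = sample_word(alphabet, max_len, rng)
        if dfa_a.accepts(word) == dfa_b.accepts(word):
            agree += 1
    return agree / n_samples


ALPHABET = ["a", "b"]

# Language 1: words over {a, b} with an even number of 'a' symbols.
even_a = DFA(
    transitions={
        ("even", "a"): "odd", ("even", "b"): "even",
        ("odd", "a"): "even", ("odd", "b"): "odd",
    },
    start="even",
    accepting={"even"},
)

# Language 2: every word over {a, b}.
all_words = DFA(
    transitions={("q", "a"): "q", ("q", "b"): "q"},
    start="q",
    accepting={"q"},
)

if __name__ == "__main__":
    print(estimate_similarity(even_a, even_a, ALPHABET))
    print(estimate_similarity(even_a, all_words, ALPHABET))
```

Because both languages are infinite, the estimate depends on the word distribution: the bounded, uniform-length sampling assumed here weights short words more heavily than a uniform draw over all words up to `max_len` would.
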
Year | Venue | Field |
---|---|---|
2017 | arXiv: Programming Languages | Static program analysis,Spark (mathematics),Programming language,Computer science,Recurrent neural network,Theoretical computer science,Compiler,Sampling (statistics),Program analysis,Regular language,Source lines of code |
DocType | Volume | Citations |
Journal | abs/1711.01024 | 0 |
PageRank | References | Authors |
0.34 | 14 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wasuwee Sodsong | 1 | 5 | 1.86 |
Bernhard Scholz | 2 | 104 | 10.59 |
Sanjay Chawla | 3 | 1372 | 105.09 |