Title
ScriptNet: Neural Static Analysis for Malicious JavaScript Detection
Abstract
Malicious scripts are an important infection vector for computer users. For internet-scale processing, static analysis offers substantial computing efficiencies. We propose ScriptNet, a neural system that detects malicious JavaScript using static analysis. We also propose a novel deep learning model, Pre-Informant Learning (PIL), which processes JavaScript files as byte sequences. Lower layers capture the sequential nature of these byte sequences, while higher layers classify the resulting embedding as malicious or benign. Unlike previously proposed solutions, our model variants are trained end-to-end, which allows discriminative training even for the sequential processing layers. An evaluation on a large corpus of 212,408 JavaScript files indicates that the best-performing PIL model achieves a 98.10% true positive rate (TPR) on the first 60K-byte subsequences and 81.66% on the full-length files, at a false positive rate (FPR) of 0.50%. Both models significantly outperform several baseline models. The best-performing PIL model successfully detects 92.02% of unknown malware samples in a hindsight experiment in which the true labels of the malicious JavaScript files were not known when the model was trained.
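The abstract reports TPR at a fixed FPR over byte-sequence inputs. The following is a minimal sketch of those two ideas and is not the authors' implementation: the names `script_to_byte_ids` and `tpr_at_fpr`, the value `MAX_BYTES = 60_000` as a reading of "60K", and the convention that a higher score means "more malicious" are all assumptions made for illustration.

```python
from typing import List, Sequence

# Assumption: "first 60K bytes" is taken here as 60,000 bytes.
MAX_BYTES = 60_000


def script_to_byte_ids(script: bytes, max_len: int = MAX_BYTES) -> List[int]:
    """Map a raw script to integers in [0, 255], keeping only the first max_len bytes."""
    return list(script[:max_len])


def tpr_at_fpr(scores: Sequence[float], labels: Sequence[int], target_fpr: float) -> float:
    """Return the TPR at a threshold whose FPR does not exceed target_fpr.

    labels: 1 = malicious, 0 = benign. A sample is flagged when its score
    is strictly greater than the chosen threshold.
    """
    benign = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    malicious = [s for s, y in zip(scores, labels) if y == 1]
    if not benign or not malicious:
        raise ValueError("need at least one benign and one malicious sample")

    # Number of benign samples we may flag without exceeding the target FPR.
    allowed_fp = int(target_fpr * len(benign))
    if allowed_fp >= len(benign):
        threshold = float("-inf")  # every sample may be flagged
    else:
        # At most allowed_fp benign scores lie strictly above this value.
        threshold = benign[allowed_fp]

    detected = sum(1 for s in malicious if s > threshold)
    return detected / len(malicious)
```

With classifier scores for a labeled test set, `tpr_at_fpr(scores, labels, 0.005)` would give the kind of operating-point figure quoted above (TPR at 0.50% FPR), under the stated assumptions.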
Year: 2019
DOI: 10.1109/MILCOM47813.2019.9020870
Venue: MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM)
Keywords: Malware detection, Neural models, LSTM
Field: Byte, False positive rate, Embedding, Pattern recognition, Computer science, Static analysis, Theoretical computer science, Artificial intelligence, Subsequence, Discriminative model, Scripting language, JavaScript
DocType:
Journal:
Volume: abs/1904.01126
ISSN: 2155-7578
ISBN: 978-1-7281-4281-4
Citations: 0
PageRank: 0.34
References: 10
Authors: 4
Name                   Order  Citations  PageRank
Jack W. Stokes         1      199        20.85
Rakshit Agrawal        2      5          7.44
Geoff McDonald         3      0          0.34
Matthew J. Hausknecht  4      0          0.34