Title
Detecting Malicious URLs: A Semi-Supervised Machine Learning System Approach
Abstract
As the malware industry grows, so do the means of infecting a computer or device. One of the most common infection vectors is to use the Internet as an entry point. Not only is this method easy to use, but because URLs come in many forms, it is difficult to distinguish a malicious URL from a benign one. Furthermore, any system that tries to classify or detect URLs must work on a real-time stream and provide a fast response for every URL submitted for analysis (in our context, a fast response means less than 300-400 milliseconds per URL). From a malware creator's point of view, it is easy to change such URLs multiple times in one day. As a general observation, malicious URLs tend to have a short life: they appear, serve malicious content for several hours, and are then shut down, usually by the ISP that hosts them. This paper presents a system that analyzes URLs in network traffic and is also capable of adjusting its detection models to adapt to new malicious content. Every correctly classified URL is reused as part of a new dataset that acts as the backbone for new detection models. The system also uses different clustering techniques to identify where features are lacking for malicious URLs, thus creating a way to improve detection for this kind of threat.
Year
2016
DOI
10.1109/SYNASC.2016.045
Venue
2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)
Keywords
malicious URLs, semi-supervised learning, big data, data streams
Field
World Wide Web, Information retrieval, Computer science, Entry point, URL normalization, Feature extraction, Theoretical computer science, Semantic URL, Cluster analysis, Malware, Statistical classification, The Internet
DocType
Conference
ISSN
2470-881X
ISBN
978-1-5090-5708-5
Citations
1
PageRank
0.34
References
4
Authors
4
Name | Order | Citations | PageRank
Anton Dan Gabriel | 1 | 1 | 0.34
Dragos Gavrilut | 2 | 62 | 7.95
Baetu Ioan Alexandru | 3 | 1 | 0.34
Adrian Popescu | 4 | 550 | 31.79