Title
A Multi-Classifier Framework for Open Source Malware Forensics
Abstract
Traditional anti-virus technologies have failed to keep pace with the proliferation of malware because their signature and heuristic updates are slow. Likewise, limited time and resources make it impractical to analyze every malware sample manually. There is a need to learn from this vast quantity of data, which contains cyber attack patterns, in an automated manner in order to adapt proactively to ever-evolving threats. Machine learning offers unique advantages for learning from past cyber attacks to handle future cyber threats. The purpose of this research is to propose a framework for multi-class classification of malware into well-known categories by applying different machine learning models to a corpus of malware analysis reports. These reports are generated automatically by an open source malware sandbox. We applied extensive pre-modeling techniques for data cleaning, feature exploration and feature engineering to prepare training and test datasets. The best-performing hyper-parameters are selected to build the machine learning models. The prepared datasets are then used to train the machine learning classifiers and to compare their prediction accuracy. Finally, the results are validated through a comprehensive 10-fold cross-validation methodology. The best results are achieved with the Gaussian Naive Bayes classifier, with a random accuracy of 96% and a 10-fold cross-validation accuracy of 91.2%. The framework can be deployed in an operational environment to learn from malware attacks and proactively adapt matching countermeasures.
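The evaluation step described in the abstract (a Gaussian Naive Bayes classifier scored with 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation: the data below are synthetic two-dimensional points rather than features extracted from sandbox reports, the class names `trojan`, `worm` and `ransomware` are hypothetical, and every function name is chosen here for illustration only.

```python
import math
import random
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance strictly positive (an assumption of this sketch).
        variances = [sum((v - m) ** 2 for v in col) / n + var_floor
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one feature vector."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances))
    return max(model, key=lambda label: log_posterior(model[label]))


def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean accuracy over k shuffled, non-overlapping folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / k


def make_synthetic(n_per_class=60, seed=1):
    """Synthetic stand-in data; class names and centers are hypothetical."""
    rng = random.Random(seed)
    centers = {"trojan": (0.0, 0.0), "worm": (6.0, 0.0), "ransomware": (0.0, 6.0)}
    X, y = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
            y.append(label)
    return X, y


X, y = make_synthetic()
mean_accuracy = cross_val_accuracy(X, y, k=10)
print(f"10-fold mean accuracy on synthetic data: {mean_accuracy:.3f}")
```

Because the synthetic classes are well separated, the score printed here says nothing about the accuracy figures reported in the abstract; the sketch only shows how a single held-out split and a k-fold average are computed differently.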
Year
2018
DOI
10.1109/WETICE.2018.00027
Venue
2018 IEEE 27th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE)
Keywords
Cyber Attacks, Malware Forensics, Machine Learning
Field
Sandbox (computer security), Naive Bayes classifier, Cyber-attack, Computer science, Heuristics, Artificial intelligence, Classifier (linguistics), Malware, Cross-validation, Machine learning, Distributed computing, Malware analysis
DocType
Conference
ISSN
1524-4547
ISBN
978-1-5386-6917-4
Citations
0
PageRank
0.34
References
5
Authors
4
Name, Order, Citations, PageRank
Naeem Amjad, 1, 0, 0.34
Hammad Afzal, 2, 41, 11.31
M. Faisal Amjad, 3, 21, 8.90
Farrukh Aslam Khan, 4, 388, 34.17