Title
Quantifying the Impact of Design Strategies for Big Data Cyber Security Analytics: An Empirical Investigation
Abstract
Big Data Cyber Security Analytics (BDCA) systems use big data technologies (e.g., Hadoop and Spark) to collect, store, and analyze large volumes of security event data in order to detect cyber-attacks. State-of-the-art systems use various design strategies (e.g., feature selection and alert ranking) to help BDCA systems achieve the desired levels of accuracy and response time. However, these strategies are not applied consistently across the state of the art, which exposes a lack of consensus on when to use (and when not to use) them. In this paper, we follow a systematic experimentation framework to quantify the impact of four design strategies on accuracy and response time with respect to three contextual factors: the security data, the machine learning model employed in the system, and the execution mode of the system. To this end, we performed experiments on a Hadoop-based BDCA system using four security datasets, five machine learning models, and three execution modes. Based on our findings, we formulate a set of design guidelines that will help researchers and practitioners decide when to use (and when not to use) these design strategies.
Year
2019
DOI
10.1109/PDCAT46702.2019.00037
Venue
2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)
Keywords
big data, cyber security, design strategy, accuracy, response time
DocType
Conference
ISSN
2640-673X
ISBN
978-1-7281-2617-3
Citations
0
PageRank
0.34
References
13
Authors
2
Name, Order, Citations, PageRank
Faheem Ullah, 1, 7, 1.89
Muhammad Ali Babar, 2, 2349, 157.18