A Close Look at a Daily Dataset of Malware Samples. - Citegraph

Paper Info

Title
A Close Look at a Daily Dataset of Malware Samples.

Abstract
The number of unique malware samples is growing out of control. Over the years, security companies have designed and deployed complex infrastructures to collect and analyze this overwhelming number of samples. As a result, a security company can collect more than 1M unique files per day only from its different feeds. These are automatically stored and processed to extract actionable information derived from static and dynamic analysis. However, only a tiny amount of this data is interesting for security researchers and attracts the interest of a human expert. To the best of our knowledge, nobody has systematically dissected these datasets to precisely understand what they really contain. The security community generally discards the problem because of the alleged prevalence of uninteresting samples. In this article, we guide the reader through a step-by-step analysis of the hundreds of thousands of Windows executables collected in one day from these feeds. Our goal is to show how a company can employ existing state-of-the-art techniques to automatically process these samples and then perform manual experiments to understand and document the real content of this gigantic dataset. We present the filtering steps, and we discuss in detail how samples can be grouped together according to their behavior to support manual verification. Finally, we use the results of this measurement experiment to provide a rough estimate of both the human and computer resources that are required to get to the bottom of the catch of the day.

Field	Value
Year	2018
DOI	10.1145/3291061
Venue	ACM Trans. Priv. Secur.
Keywords	Malware, classification, measurement, prioritization
DocType	Journal
Volume	22
Issue	1
ISSN	2471-2566
Citations	0
PageRank	0.34
References	0
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Xabier Ugarte-Pedrero	1	311	17.43
Mariano Graziano	2	40	5.30
Davide Balzarotti	3	2040	113.64
