Attacking DBSCAN for Fun and Profit. - Citegraph

Paper Info

Title
Attacking DBSCAN for Fun and Profit.

Abstract
Many security applications depend critically on clustering. However, we do not know of any clustering algorithms that were designed with an adversary in mind. An intelligent adversary may be able to use this to her advantage to subvert the security of the application. Already, adversaries use obfuscation and other techniques to alter the representation of their inputs in feature space to avoid detection. As one example, spam email often mimics normal email. In this work, we investigate a more active attack, in which an adversary attempts to subvert clustering analysis by feeding in carefully crafted data points. Specifically, in this work we explore how an attacker can subvert DBSCAN, a popular density-based clustering algorithm. We explore a “confidence attack,” where an adversary seeks to poison the clusters to the point that the defender loses confidence in the utility of the system. This may result in the system being abandoned, or worse, waste the defender’s time investigating false alarms. While our attacks generalize to all DBSCAN-based tools, we focus our evaluation on AnDarwin, a tool designed to detect plagiarized Android apps. We show that an adversary can merge arbitrary clusters by connecting them with “bridges”, that even a small number of merges can greatly degrade clustering performance, and that the defender has limited recourse when relying solely on DBSCAN. Finally, we propose a remediation process that uses machine learning and features based on outlier measures that are orthogonal to the underlying clustering problem to detect and remove injected points.
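
The "bridge" idea in the abstract can be illustrated with a small sketch. This is not the authors' code or the AnDarwin pipeline: the textbook DBSCAN below, the toy 2-D points, and the values of EPS and MIN_PTS are all assumptions chosen for illustration. It shows only that a chain of injected points, spaced so that each one is a core point, causes DBSCAN to report two well-separated clusters as one.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Minimal textbook DBSCAN; returns one label per point, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reachable from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is core: keep expanding
    return labels

def count_clusters(labels):
    """Number of distinct clusters, ignoring noise."""
    return len(set(labels) - {-1})

# Hypothetical parameters and data, chosen only for this illustration.
EPS, MIN_PTS = 1.5, 3
cluster_a = [(x, y) for x in (0, 1, 2) for y in (0, 1)]
cluster_b = [(x, y) for x in (20, 21, 22) for y in (0, 1)]
clean = cluster_a + cluster_b

# Attacker-injected bridge: unit-spaced points spanning the gap. Each bridge
# point has itself and two neighbors within EPS, so it is a core point.
bridge = [(x, 0) for x in range(3, 20)]

n_before = count_clusters(dbscan(clean, EPS, MIN_PTS))
n_after = count_clusters(dbscan(clean + bridge, EPS, MIN_PTS))
print(n_before, n_after)
```

In this toy setting the 17 injected points are enough to collapse the two clusters into one. How few points an attacker needs in practice depends on the distance metric, the density parameters, and the feature space, none of which are specified on this page.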

Year	Venue	Field
2015	SDM	Data point,Feature vector,Android (operating system),Computer science,Computer security,Outlier,Artificial intelligence,Adversary,Obfuscation,Cluster analysis,DBSCAN,Machine learning
DocType	Citations	PageRank
Conference	0	0.34
References	Authors
11	2

Authors (2 rows)

Name	Order	Citations	PageRank
Jonathan Crussell	1	471	17.12
W. Philip Kegelmeyer	2	3498	146.54
