Title
Attacking DBSCAN for Fun and Profit.
Abstract
Many security applications depend critically on clustering. However, we do not know of any clustering algorithms that were designed with an adversary in mind. An intelligent adversary may be able to use this to her advantage to subvert the security of the application. Already, adversaries use obfuscation and other techniques to alter the representation of their inputs in feature space to avoid detection. As one example, spam email often mimics normal email. In this work, we investigate a more active attack, in which an adversary attempts to subvert clustering analysis by feeding in carefully crafted data points. Specifically, we explore how an attacker can subvert DBSCAN, a popular density-based clustering algorithm. We explore a “confidence attack,” where an adversary seeks to poison the clusters to the point that the defender loses confidence in the utility of the system. This may result in the system being abandoned, or worse, waste the defender’s time investigating false alarms. While our attacks generalize to all DBSCAN-based tools, we focus our evaluation on AnDarwin, a tool designed to detect plagiarized Android apps. We show that an adversary can merge arbitrary clusters by connecting them with “bridges”, that even a small number of merges can greatly degrade clustering performance, and that the defender has limited recourse when relying solely on DBSCAN. Finally, we propose a remediation process that uses machine learning and features based on outlier measures that are orthogonal to the underlying clustering problem to detect and remove injected points.
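The "bridge" idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation or their attack construction, which the abstract does not specify. It assumes a minimal textbook DBSCAN written in pure Python, a toy 2-D data set, and a hypothetical helper `bridge` that injects evenly spaced points between two clusters. The parameter names `eps` and `min_pts` follow the usual DBSCAN terminology.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Minimal textbook DBSCAN; returns one label per point, -1 meaning noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # only core points keep expanding
                queue.extend(j_neighbors)
    return labels

def bridge(start, end, step):
    """Hypothetical attacker helper: evenly spaced points strictly between
    start and end, with consecutive points at most `step` apart."""
    n = math.ceil(math.dist(start, end) / step)
    return [tuple(s + (e - s) * k / n for s, e in zip(start, end))
            for k in range(1, n)]

def n_clusters(labels):
    """Number of distinct clusters, ignoring noise."""
    return len(set(labels) - {-1})

# Toy data (an assumption for illustration): two small grids far apart.
grid = [(x * 0.5, y * 0.5) for x in range(3) for y in range(2)]
clean = grid + [(x + 10.0, y) for x, y in grid]

EPS, MIN_PTS = 0.6, 3

# Injected points spaced closer than EPS, so each has enough neighbors
# (itself plus both adjacent bridge points) to count as a core point.
injected = bridge((1.0, 0.25), (10.0, 0.25), step=0.5)

clusters_before = n_clusters(dbscan(clean, EPS, MIN_PTS))
clusters_after = n_clusters(dbscan(clean + injected, EPS, MIN_PTS))
print(clusters_before, clusters_after)
```

With these toy parameters, the clean data forms two clusters, and adding the 17 injected points collapses them into one. The sketch depends on every bridge point being a core point, which is why the spacing is kept below `eps` and `min_pts` is small. A real attacker would have to adapt the spacing and the number of injected points to the defender's parameters.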
Year
2015
Venue
SDM
Field
Data point, Feature vector, Android (operating system), Computer science, Computer security, Outlier, Artificial intelligence, Adversary, Obfuscation, Cluster analysis, DBSCAN, Machine learning
DocType
Conference
Citations
0
PageRank
0.34
References
11
Authors
2
Name
Order
Citations
PageRank
Jonathan Crussell147117.12
W. Philip Kegelmeyer23498146.54