Abstract |
---|
Why is a given node in a time-evolving graph ($t$-graph) marked as an anomaly by an off-the-shelf detection algorithm? Is it because of the number of its outgoing or incoming edges, or their timings? How can we convince a human analyst that the node is anomalous? Our work aims to provide succinct, interpretable, and simple explanations of anomalous behavior in $t$-graphs (e.g., communications and IP-IP interactions) while respecting the limited attention of human analysts. Specifically, we extract key features from such graphs and propose to output a few pair (scatter) plots from this feature space that best explain known anomalies. To this end, our work makes four main contributions: (a) problem formulation: we introduce an analyst-friendly problem formulation for explaining anomalies via pair plots; (b) explanation algorithm: we propose a plot-selection objective and the LookOut algorithm to approximate it with optimality guarantees; (c) generality: our explanation algorithm is both domain- and detector-agnostic; and (d) scalability: we show that LookOut scales linearly in the number of edges of the input graph. Our experiments show that LookOut performs near-ideally in terms of maximizing the explanation objective on several real datasets, including Enron e-mail and DBLP coauthorship. Furthermore, LookOut is fast and produces visually interpretable, intuitive results when explaining ground-truth anomalies from Enron, DBLP, and LBNL (computer network) data. |
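
The abstract does not spell out the plot-selection objective, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each candidate pair plot assigns every known anomaly a nonnegative score, that the objective sums, over anomalies, the best score among the selected plots, and that plots are picked greedily under a budget. The function name `greedy_select_plots`, the `scores` layout, and the toy numbers are all hypothetical.

```python
from typing import Dict, List, Tuple


def greedy_select_plots(
    scores: Dict[str, Dict[str, float]], budget: int
) -> Tuple[List[str], float]:
    """Greedily pick up to `budget` plots.

    ASSUMPTION (not stated in the abstract): scores[plot][anomaly] is a
    nonnegative number saying how well `plot` explains `anomaly`, and the
    objective is the sum over anomalies of the best score among chosen plots.
    """
    anomalies = sorted({a for row in scores.values() for a in row})
    best = {a: 0.0 for a in anomalies}  # best score so far per anomaly
    chosen: List[str] = []

    for _ in range(min(budget, len(scores))):
        best_plot, best_gain = None, 0.0
        for plot in sorted(scores):
            if plot in chosen:
                continue
            # Marginal gain: how much this plot improves each anomaly's best score.
            gain = sum(
                max(scores[plot].get(a, 0.0) - best[a], 0.0) for a in anomalies
            )
            if gain > best_gain:
                best_plot, best_gain = plot, gain
        if best_plot is None:  # no remaining plot improves the objective
            break
        chosen.append(best_plot)
        for a in anomalies:
            best[a] = max(best[a], scores[best_plot].get(a, 0.0))

    return chosen, sum(best.values())


# Hypothetical toy input: three candidate plots, three known anomalies.
toy_scores = {
    "p1": {"a": 0.9, "b": 0.1, "c": 0.0},
    "p2": {"a": 0.1, "b": 0.8, "c": 0.7},
    "p3": {"a": 0.5, "b": 0.5, "c": 0.5},
}
selected, objective = greedy_select_plots(toy_scores, budget=2)
```

On the toy input, the greedy pass first takes the plot with the largest total score and then the plot that adds the most for the anomalies still poorly explained. An objective of this max-then-sum form is monotone and submodular, which is the standard setting in which greedy selection carries an approximation guarantee; whether that matches the guarantee claimed in the abstract is an assumption here.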

Year | Venue | Field |
---|---|---|
2017 | arXiv: Social and Information Networks | Graph, Data mining, Feature vector, Computer science, Theoretical computer science, Detector, Generality, Scalability, Anomalous behavior |

DocType | Volume | Citations |
---|---|---|
Journal | abs/1710.05333 | 0 |

PageRank | References | Authors |
---|---|---|
0.34 | 14 | 5 |

Name | Order | Citations | PageRank |
---|---|---|---|
Nikhil Gupta | 1 | 0 | 1.69 |
Dhivya Eswaran | 2 | 27 | 4.27 |
Neil Shah | 3 | 0 | 2.37 |
Leman Akoglu | 4 | 1498 | 71.55 |
Christos Faloutsos | 5 | 27972 | 4490.38 |