Title
Near-Optimal and Practical Algorithms for Graph Scan Statistics with Connectivity Constraints
Abstract
One fundamental task in network analysis is detecting “hotspots” or “anomalies” in the network; that is, detecting subgraphs where there is significantly more activity than one would expect given historical data or some baseline process. Scan statistics is one popular approach used for anomalous subgraph detection. This methodology involves maximizing a score function over all connected subgraphs, which is a challenging computational problem. A number of heuristics have been proposed for these problems, but they do not provide any quality guarantees. Here, we propose a framework for designing algorithms for optimizing a large class of scan statistics for networks, subject to connectivity constraints. Our algorithms run in time that scales linearly on the size of the graph and depends on a parameter we call the “effective solution size,” while providing rigorous approximation guarantees. In contrast, most prior methods have super-linear running times in terms of graph size. Extensive empirical evidence demonstrates the effectiveness and efficiency of our proposed algorithms in comparison with state-of-the-art methods. Our approach improves on the performance relative to all prior methods, giving up to over 25% increase in the score. Further, our algorithms scale to networks with up to a million nodes, which is 1--2 orders of magnitude larger than all prior applications.
Year
DOI
Venue
2019
10.1145/3309712
ACM Transactions on Knowledge Discovery From Data
Keywords
Field
DocType
Scan statistics, anomalous subgraph detection, graph anomaly detection, parameterized complexity
Graph,Data mining,Computer science
Journal
Volume
Issue
ISSN
13
2
1556-4681
Citations 
PageRank 
References 
0
0.34
0
Authors
3
Name
Order
Citations
PageRank
Jose Cadena1987.53
Feng Chen245148.47
Anil Kumar S. Vullikanti3113598.30