Abstract | ||
---|---|---|
ABSTRACTA variety of tasks on dynamic graphs, including anomaly detection, community detection, compression, and graph understanding, have been formulated as problems of identifying constituent (near) bi-cliques (i.e., complete bipartite graphs). Even when we restrict our attention to maximal ones, there can be exponentially many near bi-cliques, and thus finding all of them is practically impossible for large graphs. Then, two questions naturally arise: (Q1) What is a ''good'' set of near bi-cliques? That is, given a set of near bi-cliques in the input dynamic graph, how should we evaluate its quality? (Q2) Given a large dynamic graph, how can we rapidly identify a high-quality set of near bi-cliques in it? Regarding Q1, we measure how concisely, precisely, and exhaustively a given set of near bi-cliques describes the input dynamic graph. We combine these three perspectives systematically on the Minimum Description Length principle. Regarding Q2, we propose CutNPeel, a fast search algorithm for a high-quality set of near bi-cliques. By adaptively re-partitioning the input graph, CutNPeel reduces the search space and at the same time improves the search quality. Our experiments using six real-world dynamic graphs demonstrate that CutNPeel is (a) High-quality: providing near bi-cliques of up to 51.2% better quality than its state-of-the-art competitors, (b) Fast: up to 68.8X faster than the next-best competitor, and (c) Scalable: scaling to graphs with 134 million edges. We also show successful applications of CutNPeel to graph compression and pattern discovery. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1145/3488560.3498390 | WSDM |
Keywords | DocType | Citations |
Bi-clique, Dynamic Graph, Graph Compression, Pattern Discovery | Conference | 0 |
PageRank | References | Authors |
0.34 | 0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hyeonjeong Shin | 1 | 0 | 0.34 |
Taehyung Kwon | 2 | 0 | 0.34 |
Neil Shah | 3 | 323 | 24.15 |
Kijung Shin | 4 | 0 | 2.03 |