Title
A Local Algorithm for Structure-Preserving Graph Cut
Abstract
Nowadays, large-scale graph data is being generated in a variety of real-world applications, from social networks to co-authorship networks, from protein-protein interaction networks to road traffic networks. Many existing works on graph mining focus on the vertices and edges, with the first-order Markov chain as the underlying model. They fail to explore the high-order network structures, which are of key importance in many high impact domains. For example, in bank customer personally identifiable information (PII) networks, the star structures often correspond to a set of synthetic identities; in financial transaction networks, the loop structures may indicate the existence of money laundering. In this paper, we focus on mining user-specified high-order network structures and aim to find a structure-rich subgraph which does not break many such structures by separating the subgraph from the rest. A key challenge associated with finding a structure-rich subgraph is the prohibitive computational cost. To address this problem, inspired by the family of local graph clustering algorithms for efficiently identifying a low-conductance cut without exploring the entire graph, we propose to generalize the key idea to model high-order network structures. In particular, we start with a generic definition of high-order conductance, and define the high-order diffusion core, which is based on a high-order random walk induced by user-specified high-order network structure. Then we propose a novel High-Order Structure-Preserving LOcal Cut (HOSPLOC) algorithm, which runs in polylogarithmic time with respect to the number of edges in the graph. It starts with a seed vertex and iteratively explores its neighborhood until a subgraph with a small high-order conductance is found. Furthermore, we analyze its performance in terms of both effectiveness and efficiency. The experimental results on both synthetic graphs and real graphs demonstrate the effectiveness and efficiency of our proposed HOSPLOC algorithm.
Year
DOI
Venue
2017
10.1145/3097983.3098015
KDD
Keywords
Field
DocType
Local Clustering Algorithm,High-Order Network Structure
Graph theory,Data mining,Line graph,Spatial network,Computer science,Graph factorization,Induced subgraph isomorphism problem,Distance-hereditary graph,Artificial intelligence,Clustering coefficient,Subgraph isomorphism problem,Machine learning
Conference
Citations 
PageRank 
References 
11
0.53
22
Authors
7
Name
Order
Citations
PageRank
Dawei Zhou17710.03
Si Zhang2160.93
Mehmet Yigit Yildirim3142.61
Scott Alcorn4140.92
Hanghang Tong53560202.37
H. Davulcu6414.68
Jingrui He797775.40