Title
Local Spectral Clustering for Overlapping Community Detection.
Abstract
Large graphs arise in a number of contexts and understanding their structure and extracting information from them is an important research area. Early algorithms for mining communities have focused on global graph structure, and often run in time proportional to the size of the entire graph. As we explore networks with millions of vertices and find communities of size in the hundreds, it becomes important to shift our attention from macroscopic structure to microscopic structure in large networks. A growing body of work has been adopting local expansion methods in order to identify communities from a few exemplary seed members. In this article, we propose a novel approach for finding overlapping communities called Lemon (Local Expansion via Minimum One Norm). Provided with a few known seeds, the algorithm finds the community by performing a local spectral diffusion. The core idea of Lemon is to use short random walks to approximate an invariant subspace near a seed set, which we refer to as local spectra. Local spectra can be viewed as a low-dimensional embedding that captures the nodes’ closeness in the local network structure. We show that Lemon’s performance in detecting communities is competitive with state-of-the-art methods. Moreover, the running time scales with the size of the community rather than that of the entire graph. The algorithm is easy to implement and is highly parallelizable. We further provide theoretical analysis of the local spectral properties, bounding the measure of tightness of the extracted community using the eigenvalues of the graph Laplacian. We thoroughly evaluate our approach using both synthetic and real-world datasets across different domains, and analyze the empirical variations when applying our method to inherently different networks in practice. In addition, we provide heuristics on how the quality and quantity of the seed set affect performance.
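To make the seed-expansion idea in the abstract concrete, the following is a minimal sketch of a much simpler relative of it: a short lazy random walk started from the seeds, followed by a degree-normalized ranking. It is not the Lemon algorithm itself; it omits the invariant-subspace construction and the minimum one-norm step. The function names, the toy graph, and the parameters `steps` and `size` are assumptions made for illustration.

```python
from collections import defaultdict


def short_walk_scores(adj, seeds, steps=3):
    """Run a lazy random walk for a few steps, starting from the seed set."""
    # Start with probability mass spread uniformly over the seeds.
    prob = {s: 1.0 / len(seeds) for s in seeds}
    for _ in range(steps):
        nxt = defaultdict(float)
        for u, mass in prob.items():
            nxt[u] += 0.5 * mass              # lazy step: stay put with probability 1/2
            share = 0.5 * mass / len(adj[u])  # otherwise move to a uniformly chosen neighbor
            for v in adj[u]:
                nxt[v] += share
        prob = dict(nxt)
    return prob


def expand_seed_set(adj, seeds, size, steps=3):
    """Return the `size` nodes with the highest degree-normalized walk probability."""
    prob = short_walk_scores(adj, seeds, steps)
    ranked = sorted(prob, key=lambda v: prob[v] / len(adj[v]), reverse=True)
    return ranked[:size]


# Toy graph (hypothetical): two 4-cliques joined by the single edge 3-4.
edges = [(a, b) for clique in ([0, 1, 2, 3], [4, 5, 6, 7])
         for i, a in enumerate(clique) for b in clique[i + 1:]] + [(3, 4)]
adj = defaultdict(list)
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

community = expand_seed_set(adj, seeds=[0], size=4)
print(sorted(community))
```

Because the walk only touches nodes reachable within `steps` hops of the seeds, the work done in this sketch depends on the explored neighborhood rather than on the whole graph, which is the locality property the abstract emphasizes.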
Year
2018
DOI
10.1145/3106370
Venue
TKDD
Keywords
Community detection, graph diffusion, local spectral clustering, random walk, seed set expansion
Field
Laplacian matrix, Data mining, Spectral clustering, Embedding, Computer science, Closeness, Random walk, Invariant subspace, Heuristics, Bounding overwatch
DocType
Journal
Volume
12
Issue
2
ISSN
1556-4681
Citations
10
PageRank
0.54
References
22
Authors
5
Name            Order  Citations  PageRank
Yixuan Li       1      170        9.46
Kun He          2      305        42.88
Kyle Kloster    3      79         5.72
David Bindel    4      427        29.24
John Hopcroft   5      4245       1836.70