Mining top-K large structural patterns in a massive network - Citegraph

Paper Info

Title
Mining top-K large structural patterns in a massive network

Abstract
AbstractWith ever-growing popularity of social networks, web and bio-networks, mining large frequent patterns from a single huge network has become increasingly important. Yet the existing pattern mining methods cannot offer the efficiency desirable for large pattern discovery. We propose Spider-Mine, a novel algorithm to efficiently mine top-K largest frequent patterns from a single massive network with any user-specified probability of 1 - ∈. Deviating from the existing edge-by-edge (i.e., incremental) pattern-growth framework, SpiderMine achieves its efficiency by unleashing the power of small patterns of a bounded diameter, which we call "spiders". With the spider structure, our approach adopts a probabilistic mining framework to find the top-k largest patterns by (i) identifying an affordable set of promising growth paths toward large patterns, (ii) generating large patterns with much lower combinatorial complexity, and finally (iii) greatly reducing the cost of graph isomorphism tests with a new graph pattern representation by a multi-set of spiders. Extensive experimental studies on both synthetic and real data sets show that our algorithm outperforms existing methods.

Year	DOI	Venue
2011	10.14778/3402707.3402720	Hosted Content
Keywords	Field	DocType
information system,scientific computing,graph isomorphism,numerical analysis,social network	Data mining,Graph,Data set,Social network,Graph isomorphism,Computer science,Combinatorial complexity,Probabilistic logic,Database,Bounded function	Journal
Volume	Issue	ISSN
4	11	2150-8097
Citations	PageRank	References
35	1.12	25
Authors
6

Authors (6 rows)

Cited by (35 rows)

References (25 rows)

Name	Order	Citations	PageRank
Feida Zhu	1	1212	67.23
qiang qu	2	83	12.15
David Lo	3	5346	259.67
Xifeng Yan	4	6633	280.06
Jiawei Han	5	43085	3824.48
Philip S. Yu	6	30670	3474.16

1