Title
Compression-Based Graph Mining Exploiting Structure Primitives
Abstract
How can we retrieve information from sparse graphs? Traditional graph mining approaches focus on discovering dense patterns inside complex networks, for example modularity-based or cut-based methods. However, most real world data sets are very sparse. Nevertheless, traditional approaches tend to omit interesting sparse patterns like stars. In this paper, we propose a novel graph mining technique modeling the transitivity and the hub ness of a graph using structure primitives. We exploit these structure primitives for effective graph compression using the Minimum Description Length Principle. The compression rate is an unbiased measure for the transitivity or hub ness and therefore provides interesting insights into the structure of even very sparse graphs. Since real graphs can be composed of sub graphs of different structures, we propose a novel algorithm CXprime (Compression-based exploiting Primitives) for clustering graphs using our coding scheme as an objective function. In contrast to traditional graph clustering methods, our algorithm automatically recognizes different types of sub graphs without requiring the user to specify input parameters. Additionally we propose a novel link prediction algorithm based on the detected substructures, which increases the quality of former methods. Extensive experiments evaluate our algorithms on synthetic and real data.
Year
DOI
Venue
2013
10.1109/ICDM.2013.56
ICDM
Keywords
Field
DocType
minimum description length principle,example modularity-based method,cut-based methods,pattern clustering,dense pattern discovery,cxprime algorithm,sparse graphs,compression,star parse patterns,graph mining,compression-based graph mining technique,coding scheme,objective function,information retrieval,link prediction,data compression,minimum description length,graph clustering methods,partition,data mining,graph theory,compression rate,structure primitives,link prediction algorithm
Data mining,Modular decomposition,Graph property,Computer science,Theoretical computer science,Artificial intelligence,Clustering coefficient,Graph theory,Graph operations,Graph product,Clique-width,Graph (abstract data type),Machine learning
Conference
ISSN
Citations 
PageRank 
1550-4786
2
0.37
References 
Authors
4
5
Name
Order
Citations
PageRank
Jing Feng140.74
Xiao He2644.65
Nina Hubig3134.64
Christian Böhm42494528.46
Claudia Plant553654.69