Title
Mining Frequent Subgraph Patterns from Uncertain Graph Data
Abstract
In many real applications, graph data is subject to uncertainties due to incompleteness and imprecision of data. Mining such uncertain graph data is semantically different from and computationally more challenging than mining conventional exact graph data. This paper investigates the problem of mining uncertain graph data and especially focuses on mining frequent subgraph patterns on an uncertain graph database. A novel model of uncertain graphs is presented, and the frequent subgraph pattern mining problem is formalized by introducing a new measure, called expected support. This problem is proved to be NP-hard. An approximate mining algorithm is proposed to find a set of approximately frequent subgraph patterns by allowing an error tolerance on expected supports of discovered subgraph patterns. The algorithm uses efficient methods to determine whether a subgraph pattern can be output or not and a new pruning method to reduce the complexity of examining subgraph patterns. Analytical and experimental results show that the algorithm is very efficient, accurate, and scalable for large uncertain graph databases. To the best of our knowledge, this paper is the first one to investigate the problem of mining frequent subgraph patterns from uncertain graph data.
Year
DOI
Venue
2010
10.1109/TKDE.2010.80
IEEE Trans. Knowl. Data Eng.
Keywords
Field
DocType
uncertain graph,frequent subgraph pattern mining,mining frequent subgraph patterns,frequent subgraph pattern mining problem,approximation theory,graph mining,algorithm.,np-hard problem,uncertain graph database,approximate mining algorithm,error tolerance,expected support measurement,computational complexity,pruning method,subgraph pattern,conventional exact graph data,uncertain graph data,large uncertain graph databases,data mining,frequent subgraph pattern,graph data,network topology,data model,proteins,algorithm design and analysis,np hard problem,uncertainty,bioinformatics,predictive models,depth first search,algorithm,dna,search space,wireless networks,databases
Data mining,Computer science,Induced subgraph isomorphism problem,Theoretical computer science,Artificial intelligence,Subgraph isomorphism problem,Graph theory,Graph database,Planarity testing,Maximum common subgraph isomorphism problem,Molecule mining,Machine learning,Graph (abstract data type)
Journal
Volume
Issue
ISSN
22
9
1041-4347
Citations 
PageRank 
References 
84
2.34
43
Authors
4
Name
Order
Citations
PageRank
Zhaonian Zou133115.78
Jianzhong Li23196304.46
Hong Gao31086120.07
Shuo Zhang42119.44