Title
Approximate matching over biological RDF graphs
Abstract
In the last few years, the amount of biological interaction data discovered and stored in public databases (e.g., KEGG [2]) considerably increased. To this aim, RDF is a powerful representation for interactions (or pathways), since they can be modeled as directed graphs, often referred to as biological networks, where nodes represent cellular components and the (labeled or unlabeled) edges correspond to interactions among components. Often for a given organism some components are known to be linked by well studied interactions. Such groups of components are called modules and they can be represented by sub-graphs in the corresponding biological network model. At today, one of the most important problems for biologists is that of querying the interaction dataset of an organism by a specific module exploited as the query. The aim is to discover if such a module is contained in the input interaction dataset. In this scenario biological variations (e.g. insertions and/or deletions of both nodes and edges) due to the evolution need to be considered in the search process, thus that approximate matching is more effective than exact matching. Typically the problem is reconducted to the search of all the subgraphs in the biological network (i.e. the input graph database) that are isomorphic to the given module (i.e. the query graph). Since such a test is known to be an NP-hard problem, many proposals introduce heuristics (i.e. similarity or distance measurements) and particular indexing structures to reduce the overall complexity (e.g., [3, 4]).
Year
DOI
Venue
2012
10.1145/2245276.2232000
SAC
Keywords
Field
DocType
biological rdf graph,biological network,input interaction dataset,corresponding biological network model,approximate matching,interaction dataset,np-hard problem,biological interaction data,specific module,scenario biological variation,exact matching,np hard problem,computational biology,directed graph,stochastic simulation
Stochastic simulation,Graph database,Computer science,Biological network,Search engine indexing,KEGG,Theoretical computer science,Heuristics,Isomorphism,RDF
Conference
Citations 
PageRank 
References 
2
0.37
3
Authors
2
Name
Order
Citations
PageRank
Roberto De Virgilio120227.46
Simona E. Rombo219222.21