Title
Distributed Processing of Regular Path Queries in RDF Graphs
Abstract
SPARQL 1.1 offers a type of navigational query for RDF systems, called the regular path query (RPQ). A regular path query retrieves node pairs such that the paths between them satisfy a given regular expression. Regular path queries are difficult to evaluate efficiently because the search space can be very large, and there has been no scalable and practical solution so far. In this paper, we present Leon+, an in-memory distributed framework, to address the RPQ problem in the context of knowledge graphs. To reduce the search space and mitigate mounting communication costs, Leon+ takes advantage of join-ahead pruning via a novel RDF summarization technique together with a path partitioning strategy. We also develop a subtle cost model to devise query plans that achieve high efficiency for complex RPQs. As there has been no available RPQ benchmark, we create micro-benchmarks on both synthetic and real-world datasets. We present a thorough experimental comparison of our approach with state-of-the-art RDF stores. The results show that our approach is 5x faster than the competitors on a single RPQ. For query workloads, it saves up to 1/2 of the time and 2/3 of the communication overhead compared with the baseline method.
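To make the notion of a regular path query concrete, the following is a minimal single-machine sketch, not the Leon+ method described in the abstract. It evaluates an RPQ naively by breadth-first search over pairs of (graph node, automaton state). The toy triples, the function name `eval_rpq`, and the hand-written automaton for the property path `knows+/worksAt` are all assumptions made for illustration.

```python
from collections import deque


def eval_rpq(edges, nfa, start_state, accept_states):
    """Return all (source, target) node pairs linked by a path whose
    sequence of edge labels is accepted by the given NFA."""
    # Adjacency list: node -> list of (label, neighbor).
    adj = {}
    nodes = set()
    for s, label, o in edges:
        adj.setdefault(s, []).append((label, o))
        nodes.update((s, o))

    results = set()
    for source in nodes:
        # BFS over (graph node, automaton state) pairs, i.e. the
        # product of the data graph and the query automaton.
        seen = {(source, start_state)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accept_states:
                results.add((source, node))
            for label, nxt in adj.get(node, []):
                for nxt_state in nfa.get((state, label), ()):
                    pair = (nxt, nxt_state)
                    if pair not in seen:
                        seen.add(pair)
                        queue.append(pair)
    return results


# Hypothetical toy RDF graph as (subject, predicate, object) triples.
EDGES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "worksAt", "acme"),
    ("bob", "worksAt", "initech"),
]

# Hand-built NFA for the property path knows+/worksAt:
# state 0 is the start state, state 2 is the only accepting state.
NFA = {
    (0, "knows"): {1},
    (1, "knows"): {1},
    (1, "worksAt"): {2},
}

PAIRS = eval_rpq(EDGES, NFA, start_state=0, accept_states={2})
print(sorted(PAIRS))
```

Because this sketch starts a search from every node and visits each (node, state) pair at most once per source, its cost grows quickly with graph size, which illustrates the large search space the abstract refers to.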
Year: 2021
DOI: 10.1007/s10115-020-01536-2
Venue: KNOWLEDGE AND INFORMATION SYSTEMS
Keywords: Knowledge graph, RDF/SPARQL, Regular path queries, Graph summarization, Graph partitioning
DocType: Journal
Volume: 63
Issue: 4
ISSN: 0219-1377
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name          Order  Citations  PageRank
Xintong Guo   1      0          0.68
Hong Gao      2      1086       120.07
Zhaonian Zou  3      331        15.78