Using Graph Summarization for Join-Ahead Pruning in a Distributed RDF Engine

Paper Info

Title
Using Graph Summarization for Join-Ahead Pruning in a Distributed RDF Engine

Abstract
Demand for scalable and efficient RDF stores has grown recently, and many efficient systems, both centralized and distributed, have been proposed. Because SPARQL requires row-oriented output, most current systems rely on relational joins. One problem with relational joins, though, is the performance bottleneck caused by generating large intermediate relations, which could be avoided by using more accurate data and pruning statistics. To address this problem, several recent systems employ bisimulation-based graph summaries (adopted from XML indexing) over large RDF graphs to facilitate join-ahead pruning. In this paper, we discuss a different, locality-based graph summarization approach for RDF data and highlight its use for join-ahead pruning in a distributed SPARQL engine. Based on our recently developed TriAD engine, we present a detailed comparison of processing techniques for these graph summaries over the synthetic LUBM benchmark.
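
The general idea of join-ahead pruning over a graph summary can be illustrated with a minimal Python sketch. It is not TriAD's implementation: the node-to-partition map `PART`, the toy triples, the helper names (`match`, `summarize`, `prune_scan`) and the naive nested-loop matcher are all assumptions made for illustration, and predicates are assumed to be constants. The query is first evaluated over the much smaller summary graph, and only data triples whose endpoints fall into surviving supernodes are passed to the joins.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(patterns, triples, lookup=lambda t: t):
    """Naive nested-loop evaluation of a conjunctive triple-pattern query.
    `lookup` maps a query constant into the vocabulary of `triples`."""
    bindings = [{}]
    for s, p, o in patterns:
        extended = []
        for b in bindings:
            for ts, tp, to in triples:
                if tp != p:
                    continue
                nb, ok = dict(b), True
                for term, val in ((s, ts), (o, to)):
                    if is_var(term):
                        if nb.setdefault(term, val) != val:
                            ok = False
                    elif lookup(term) != val:
                        ok = False
                if ok:
                    extended.append(nb)
        bindings = extended
    return bindings

def summarize(triples, part):
    """Collapse every node into its partition (supernode)."""
    return {(part[s], p, part[o]) for s, p, o in triples}

def prune_scan(patterns, triples, part):
    """Stage 1: match the query on the summary graph.
    Stage 2: keep only data triples compatible with surviving supernodes."""
    allowed = {}
    for b in match(patterns, summarize(triples, part), lookup=part.get):
        for var, supernode in b.items():
            allowed.setdefault(var, set()).add(supernode)
    kept = []
    for s, p, o in patterns:
        for ts, tp, to in triples:
            if tp != p:
                continue
            if all(part[val] in allowed.get(term, set())
                   for term, val in ((s, ts), (o, to)) if is_var(term)):
                kept.append((ts, tp, to))
    return kept

# Hypothetical LUBM-flavored toy data, split into two locality-based partitions.
TRIPLES = [
    ("alice", "advisor", "profX"), ("profX", "worksFor", "dept1"),
    ("bob", "advisor", "profY"), ("profY", "worksFor", "dept2"),
    ("carol", "memberOf", "dept2"), ("dave", "memberOf", "dept1"),
]
PART = {"alice": 0, "profX": 0, "dept1": 0, "dave": 0,
        "bob": 1, "profY": 1, "dept2": 1, "carol": 1}
QUERY = [("?s", "advisor", "?p"), ("?p", "worksFor", "dept1")]

kept = prune_scan(QUERY, TRIPLES, PART)
pruned_result = match(QUERY, kept)
full_result = match(QUERY, TRIPLES)
print(len(kept), "triples scanned after pruning;", pruned_result)
```

Because the summary is a homomorphic image of the data graph, every match in the data maps to a match in the summary, so this kind of pruning can discard triples without losing answers. How much it prunes depends on how well the partitioning separates unrelated parts of the graph.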

Year	2014
DOI	10.1145/2630602.2630610
Venue	SWIM
Keywords	algorithms, design, graphs and networks, experimentation, semantic networks, content analysis and indexing, measurement, world wide web, performance
Field	Data mining, Bottleneck, Joins, Computer science, SPARQL, Bisimulation, RDF Schema, RDF, Pruning, Scalability
DocType	Conference
Citations	4
PageRank	0.47
References	14
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Sairam Gurajada	1	118	7.83
Stephan Seufert	2	4	0.47
Iris Miliaraki	3	4	0.47
Martin Theobald	4	1474	72.06