Title
On the Impact of Data Distribution in Federated SPARQL Queries
Abstract
With the growing number of publicly available SPARQL endpoints, federated queries are becoming increasingly attractive and feasible. Compared to queries against a single endpoint, queries that range over a number of endpoints pose new challenges, ranging from the type and number of datasets involved to the data distribution across the datasets. Existing research focuses on the data distribution in a central store and is mainly concerned with adopting well-known, traditional database techniques. In this work we investigate the impact of the data distribution in the context of federated SPARQL queries. We perform a number of experiments with four federation frameworks (Sesame Alibaba, Splendid, FedX, and Darq) against an RDF dataset, Dailymed, that we partition by graph and class. Our preliminary results confirm the intuition that the more datasets are involved in query processing, the worse the performance of a federated query becomes, and that the data distribution significantly influences the performance.
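To make the two partitioning schemes named in the abstract concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' procedure: the function names, the `"untyped"` bucket, the rule that a subject's first `rdf:type` decides its partition, and the `http://example.org/` toy data are all hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def partition_by_graph(quads):
    """Group (subject, predicate, object, graph) quads by their named graph."""
    parts = defaultdict(list)
    for s, p, o, g in quads:
        parts[g].append((s, p, o))
    return dict(parts)


def partition_by_class(triples):
    """Place every triple in the partition of its subject's rdf:type class.

    Assumption: the first rdf:type seen for a subject decides its partition;
    subjects without a type go to a hypothetical "untyped" bucket.
    """
    class_of = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            class_of.setdefault(s, o)
    parts = defaultdict(list)
    for s, p, o in triples:
        parts[class_of.get(s, "untyped")].append((s, p, o))
    return dict(parts)


# Hypothetical toy data, not taken from Dailymed.
EX = "http://example.org/"
triples = [
    (EX + "drug1", RDF_TYPE, EX + "Drug"),
    (EX + "drug1", EX + "name", "Aspirin"),
    (EX + "org1", RDF_TYPE, EX + "Organization"),
    (EX + "drug1", EX + "producedBy", EX + "org1"),
]
quads = [(s, p, o, EX + "graphA") for s, p, o in triples[:2]] + [
    (s, p, o, EX + "graphB") for s, p, o in triples[2:]
]

by_class = partition_by_class(triples)
by_graph = partition_by_graph(quads)
```

In this toy split, the `producedBy` triple sits in the `Drug` partition while the type of `org1` sits in the `Organization` partition, so a query joining the two would have to contact both partitions. This is the kind of cross-partition join whose cost depends on how the data is distributed.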
Year
2012
DOI
10.1109/ICSC.2012.72
Venue
Semantic Computing
Keywords
federated sparql queries,data distribution,central store and,worse performance of federation query,sesame alibaba,federated sparql query,federated query,endpoints pose new challenge,available sparql endpoint,query processing,the data distribution,coherence,distributed databases,database management systems,data handling,query languages,graph theory,resource description framework
Field
Query optimization,Data mining,Web search query,Query language,RDF query language,Information retrieval,Computer science,Web query classification,SPARQL,Spatial query,Online aggregation
DocType
Conference
ISBN
978-1-4673-4433-3
Citations
5
PageRank
0.44
References
10
Authors
2
Name, Order, Citations, PageRank
Nur Aini Rakhmawati, 1, 24, 2.52
Michael Hausenblas, 2, 478, 52.35