Title
How Representative Is a SPARQL Benchmark? An Analysis of RDF Triplestore Benchmarks
Abstract
Triplestores are data management systems for storing and querying RDF data. Over recent years, various benchmarks have been proposed to assess the performance of triplestores across different performance measures. However, choosing the most suitable benchmark for evaluating triplestores in practical settings is not a trivial task. This is because triplestores experience varying workloads when deployed in real applications. We address the problem of determining an appropriate benchmark for a given real-life workload by providing a fine-grained comparative analysis of existing triplestore benchmarks. In particular, we analyze the data and queries provided with the existing triplestore benchmarks in addition to several real-world datasets. Furthermore, we measure the correlation between the query execution time and various SPARQL query features and rank those features based on their significance levels. Our experiments reveal several interesting insights about the design of such benchmarks. With this fine-grained evaluation, we aim to support the design and implementation of more diverse benchmarks. Application developers can use our results to analyze their data and queries and choose a data management system.
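The correlation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the correlation coefficient or the feature set, so this example assumes Spearman rank correlation and uses hypothetical feature names (triple_patterns, join_vertices, result_size) with made-up numbers. It also orders features by absolute correlation strength as a simple stand-in for the significance-level ranking the abstract describes.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def rank_features(features, runtimes):
    """Sort features by absolute Spearman correlation with runtime."""
    scores = {name: spearman(vals, runtimes) for name, vals in features.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical per-query feature values and runtimes (illustrative only,
# not taken from the paper).
features = {
    "triple_patterns": [1, 2, 3, 4, 5],
    "join_vertices": [0, 1, 1, 2, 3],
    "result_size": [10, 3, 50, 7, 20],
}
runtimes_ms = [12.0, 20.0, 35.0, 60.0, 90.0]

ranking = rank_features(features, runtimes_ms)
for name, rho in ranking:
    print(f"{name}: rho={rho:.3f}")
```

On real query logs, a library routine that also reports p-values would be needed to rank features by significance level rather than by correlation strength alone.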
Year: 2019
DOI: 10.1145/3308558.3313556
Venue: WWW '19: The Web Conference on The World Wide Web Conference WWW 2019
Field: Data mining, Information retrieval, Workload, Crowdsourcing, Computer science, Triplestore, SPARQL, Feature engineering, Execution time, Data management, RDF
DocType: Conference
ISBN: 978-1-4503-6674-8
Citations: 1
PageRank: 0.35
References: 0
Authors (6)
Name | Order | Citations | PageRank
Muhammad Saleem | 1 | 194 | 21.78
Gábor Szárnyas | 2 | 53 | 7.84
Felix Conrads | 3 | 1 | 0.68
Syed Ahmad Chan Bukhari | 4 | 33 | 8.07
Qaiser Mehmood | 5 | 91 | 11.68
Axel-Cyrille Ngonga Ngomo | 6 | 1775 | 139.40