Title
Parallel and scalable processing of spatio-temporal RDF queries using Spark
Abstract
The ever-increasing size of data emanating from mobile devices and sensors dictates the use of distributed systems for storing and querying these data. Typically, such data sources provide some spatio-temporal information alongside other useful data. The RDF data model can be used to interlink and exchange data originating from heterogeneous sources in a uniform manner. For example, consider the case where vessels report their spatio-temporal position on a regular basis by using various surveillance systems. In this scenario, a user might be interested in knowing which vessels were moving in a specific area during a given temporal range. In this paper, we address the problem of efficiently storing and querying spatio-temporal RDF data in parallel. We specifically study the case of SPARQL queries with spatio-temporal constraints by proposing the DiStRDF system, which comprises a Storage Layer and a Processing Layer. The DiStRDF Storage Layer is responsible for efficiently storing large amounts of historical spatio-temporal RDF data of moving objects. On top of it, we devise our DiStRDF Processing Layer, which parses a SPARQL query and produces corresponding logical and physical execution plans. We use Spark, a well-known distributed in-memory processing framework, as the underlying processing engine. Our experimental evaluation on real data from both the aviation and maritime domains demonstrates the efficiency of our DiStRDF system when using various spatio-temporal range constraints.
Year
2021
DOI
10.1007/s10707-019-00371-0
Venue
GEOINFORMATICA
Keywords
Distributed query processing, Distributed spatio-temporal queries, SPARQL queries
DocType
Journal
Volume
25
Issue
4
ISSN
1384-6175
Citations
0
PageRank
0.34
References
0
Authors
4
Name                     Order  Citations  PageRank
Panagiotis Nikitopoulos  1      4          3.49
Akrivi Vlachou           2      751        39.95
Christos Doulkeridis     3      899        55.91
George A. Vouros         4      819        87.44