Title
Integrating open sources and relational data with SPARQL
Abstract
We believe that the ability to use SPARQL as a front end to heterogeneous data, without significant cost in performance or expressive power, is key to RDF taking its rightful place as the lingua franca of data integration. To this effect, we demonstrate how RDF and SPARQL can tackle a mix of standard relational workload and data mining in public data sources. We discuss extending SPARQL for business intelligence (BI) workloads and report our experience running SPARQL against relational and native RDF databases. We use the well-known TPC-H benchmark as our reference schema and workload. We define a mapping of the TPC-H schema to RDF and restate the queries as BI-extended SPARQL. To this end, we define aggregation and nested queries for SPARQL. We demonstrate that it is possible to perform the TPC-H workload restated in SPARQL against an existing RDBMS without loss of performance or expressivity and without changes to the RDBMS. Finally, we demonstrate how to combine TPC-H or XBRL financial reports with RDF data from the CIA Factbook and DBpedia.
Year: 2008
DOI: 10.1007/978-3-540-68234-9_69
Venue: ESWC
Keywords: native rdf databases, data mining, relational data, rdf data, standard relational workload, heterogeneous data, data integration, tpc h schema, tpc h benchmark, tpc h workload, open source, public data source, data integrity, front end, business intelligence, expressive power
Field: Data integration, Data mining, Relational database, Information retrieval, Computer science, SPARQL, Relational database management system, XBRL, Named graph, RDF Schema, Database, RDF
DocType: Conference
Volume: 5021
ISSN: 0302-9743
ISBN: 3-540-68233-3
Citations: 1
PageRank: 0.38
References: 2
Authors: 2
Name | Order | Citations | PageRank
Orri Erling | 1 | 489 | 32.75
Ivan Mikhailov | 2 | 124 | 13.77