Title
D-REPR - A Language for Describing and Mapping Diversely-Structured Data Sources to RDF.
Abstract
Publishing data sources to knowledge graphs is a complicated and laborious process because data sources are often heterogeneous, hierarchical, and interlinked. For example, food price datasets may contain product prices in various units at different markets and times, and different providers may publish them in different formats such as CSV, JSON, or spreadsheets. Beyond data formats, these datasets may have differing layouts: one dataset may be organized as a row-based or relational table (prices are in one column), while another may use a matrix table (prices are in one matrix). To address these problems, we present a novel data description language for mapping datasets to RDF. In particular, our language supports specifying the locations of source attributes in the sources, mapping the attributes to ontologies, and simple rules for joining the data of these attributes to output the final RDF triples. Unlike existing approaches, our language is not restricted to specific data layouts, such as the Nested Relational Model, or to specific data formats, such as spreadsheets. Our broad data description language presents a format-independent solution, allowing interlinking among multiple heterogeneous sources and representing many diverse data structures that existing tools are unable to handle.
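The layout problem described in the abstract can be illustrated with a small sketch. The code below is not D-REPR syntax and not the authors' implementation; it is a hand-written Python illustration under stated assumptions. The sample prices, the market names, the `http://example.org/` namespace, and all function names are hypothetical. It shows the same food price data in a row-based layout and in a matrix layout, and how locating the attributes in each layout and joining them yields identical subject-predicate-object triples.

```python
# Hypothetical illustration only; this is NOT D-REPR syntax or its implementation.

# Layout 1: row-based (relational) table, prices are in one column.
ROW_TABLE = [
    ["market", "date", "product", "price"],
    ["MarketA", "2019-01", "maize", 7.5],
    ["MarketA", "2019-02", "maize", 8.0],
    ["MarketB", "2019-01", "maize", 6.9],
]

# Layout 2: matrix table, prices form a matrix of market rows by date columns.
# The top-left cell holds the product name; None marks a missing price.
MATRIX_TABLE = [
    ["maize", "2019-01", "2019-02"],
    ["MarketA", 7.5, 8.0],
    ["MarketB", 6.9, None],
]

EX = "http://example.org/"  # assumed namespace, for illustration


def records_from_rows(table):
    """Each data row already holds one complete observation."""
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]


def records_from_matrix(table):
    """Join each price cell with its row label (market) and column label (date)."""
    (product, *dates), *rows = table
    records = []
    for market, *prices in rows:
        for date, price in zip(dates, prices):
            if price is not None:
                records.append(
                    {"market": market, "date": date, "product": product, "price": price}
                )
    return records


def to_triples(records):
    """Map each record to (subject, predicate, object) string triples."""
    triples = []
    for rec in records:
        subject = f"{EX}obs/{rec['market']}-{rec['date']}-{rec['product']}"
        for attr, value in rec.items():
            triples.append((subject, f"{EX}{attr}", str(value)))
    return triples


row_triples = sorted(to_triples(records_from_rows(ROW_TABLE)))
matrix_triples = sorted(to_triples(records_from_matrix(MATRIX_TABLE)))
```

In this sketch each layout needs its own extraction function. A declarative description of attribute locations and join rules, as the abstract proposes, would aim to replace such per-layout code.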
Year: 2019
DOI: 10.1145/3360901.3364449
Venue: K-CAP
Keywords: RDF mapping, Linked Data, Knowledge Graph
Field: Ontology (information science), Data structure, Knowledge graph, Information retrieval, Computer science, Linked data, Relational model, Data model, JSON, RDF
DocType: Conference
ISBN: 978-1-4503-7008-0
Citations: 2
PageRank: 0.40
References: 2
Authors: 3
Name: Binh Vu; Order: 1; Citations: 4; PageRank: 4.50
Name: Jay Pujara; Order: 2; Citations: 86; PageRank: 14.81
Name: Craig A. Knoblock; Order: 3; Citations: 5229; PageRank: 680.57