Title
Explainable Similarity Of Datasets Using Knowledge Graph
Abstract
There is a large quantity of datasets available as Open Data on the Web. However, it is challenging for users to find datasets relevant to their needs, even though the datasets are registered in catalogs such as the European Data Portal. This is because the available metadata such as keywords or textual description is not descriptive enough. At the same time, datasets exist in various types of contexts not expressed in the metadata. These may include information about the dataset publisher, the legislation related to dataset publication, language and cultural specifics, etc. In this paper we introduce a similarity model for matching datasets. The model assumes an ontology/knowledge graph, such as Wikidata.org, that serves as a graph-based context to which individual datasets are mapped based on their metadata. A similarity of the datasets is then computed as an aggregation over paths among nodes in the graph. The proposed similarity aims at addressing the problem of explainability of similarity, i.e., providing the user a structured explanation of the match which, in a broader sense, is nowadays a hot topic in the field of artificial intelligence.
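The path-based similarity described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' model: the toy graph, the dataset-to-node mapping, the helper names `shortest_path` and `explainable_similarity`, and the choice of aggregation (the mean of 1 / (1 + path length) over all node pairs). The returned paths play the role of the structured explanation of a match.

```python
from collections import deque
from itertools import product

# Hypothetical toy knowledge graph (undirected adjacency lists); not real Wikidata data.
GRAPH = {
    "air_quality": ["environment"],
    "water_quality": ["environment"],
    "environment": ["air_quality", "water_quality", "public_policy"],
    "public_policy": ["environment", "budget"],
    "budget": ["public_policy"],
}

# Hypothetical mapping of datasets to graph nodes, e.g. derived from their keywords.
DATASET_NODES = {
    "dataset_a": ["air_quality"],
    "dataset_b": ["water_quality", "budget"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node list of a shortest path, or None."""
    if start == goal:
        return [start]
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        for neighbor in graph.get(path[-1], []):
            if neighbor in visited:
                continue
            if neighbor == goal:
                return path + [neighbor]
            visited.add(neighbor)
            queue.append(path + [neighbor])
    return None


def explainable_similarity(graph, nodes_a, nodes_b):
    """Return (score, paths). The score is the mean of 1 / (1 + edges) over node pairs.

    This aggregation is an assumed example; unreachable pairs contribute 0.
    """
    paths, scores = [], []
    for a, b in product(nodes_a, nodes_b):
        path = shortest_path(graph, a, b)
        if path is None:
            scores.append(0.0)
        else:
            paths.append(path)
            scores.append(1.0 / len(path))  # len(path) equals edges + 1
    score = sum(scores) / len(scores) if scores else 0.0
    return score, paths


score, paths = explainable_similarity(
    GRAPH, DATASET_NODES["dataset_a"], DATASET_NODES["dataset_b"]
)
for path in paths:
    print(" -> ".join(path))  # each printed path is one piece of the explanation
print(round(score, 4))
```

Shorter connecting paths raise the score, and the list of paths tells the user why two datasets were considered related. A real system would query an actual knowledge graph and would likely weight edges by type.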
Year
2019
DOI
10.1007/978-3-030-32047-8_10
Venue
SIMILARITY SEARCH AND APPLICATIONS (SISAP 2019)
Keywords
Similarity, Datasets, Search, Graph
Field
Ontology, Open data, Graph, Metadata, Knowledge graph, Information retrieval, Computer science
DocType
Conference
Volume
11807
ISSN
0302-9743
Citations
0
PageRank
0.34
References
0
Authors
4
Name, Order, Citations, PageRank
Petr Skoda, 1, 39, 9.56
Jakub Klímek, 2, 0, 0.34
Martin Necaský, 3, 0, 0.34
Tomás Skopal, 4, 202, 20.95