Title
Learning conflict resolution strategies for cross-language Wikipedia data fusion
Abstract
To use the ever-growing amounts of structured data on the web efficiently, methods and tools for quality-aware data integration are needed. In this paper we propose an approach to automatically learn strategies for conflict resolution, which is a crucial step in large-scale data integration. The approach is implemented as an extension of the Sieve data quality assessment and fusion framework. We apply and evaluate our approach on the use case of fusing data from 10 language editions of DBpedia, a large-scale structured knowledge base extracted from Wikipedia. We also propose a method for extracting rich provenance metadata for each DBpedia fact, which is later used in data fusion.
Year
2014
DOI
10.1145/2567948.2578999
Venue
WWW (Companion Volume)
Keywords
sieve data quality assessment, fusion framework, data fusion, large-scale structured knowledge base, large-scale data integration, use case, fusing data, dbpedia fact, structured data, cross-language wikipedia data fusion, conflict resolution strategy, quality-aware data integration, conflict resolution
Field
Data integration, Conflict resolution strategy, Metadata, Data mining, World Wide Web, Data quality, Information retrieval, Computer science, Conflict resolution, Sensor fusion, Knowledge base, Data model
DocType
Conference
Citations
13
PageRank
0.66
References
6
Authors
2
Name | Order | Citations | PageRank
Volha Bryl | 1 | 180 | 14.46
Christian Bizer | 2 | 8448 | 524.93