Abstract | ||
---|---|---|
To make efficient use of the ever-growing amount of structured data on the web, methods and tools for quality-aware data integration are needed. In this paper we propose an approach to automatically learning conflict resolution strategies, a crucial step in large-scale data integration. The approach is implemented as an extension of the Sieve data quality assessment and fusion framework. We apply and evaluate our approach on the use case of fusing data from 10 language editions of DBpedia, a large-scale structured knowledge base extracted from Wikipedia. We also propose a method for extracting rich provenance metadata for each DBpedia fact, which is later used in data fusion. |
Year | DOI | Venue |
---|---|---|
2014 | 10.1145/2567948.2578999 | WWW (Companion Volume) |
Keywords | Field | DocType |
Sieve data quality assessment, fusion framework, data fusion, large-scale structured knowledge base, large-scale data integration, use case, fusing data, DBpedia fact, structured data, cross-language Wikipedia data fusion, conflict resolution strategy, quality-aware data integration, conflict resolution | Data integration, Conflict resolution strategy, Metadata, Data mining, World Wide Web, Data quality, Information retrieval, Computer science, Conflict resolution, Sensor fusion, Knowledge base, Data model | Conference |
Citations | PageRank | References |
13 | 0.66 | 6 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Volha Bryl | 1 | 180 | 14.46 |
Christian Bizer | 2 | 8448 | 524.93 |