Abstract | ||
---|---|---|
Schema matching is a central challenge for data integration systems. Automated tools are often uncertain about the schema matchings they suggest, and this uncertainty is inherent because a schema cannot fully capture the semantics of the data it represents. Human common sense can often help. Inspired by the popularity and success of easily accessible crowdsourcing platforms, we explore the use of crowdsourcing to reduce the uncertainty of schema matching. Since questions on crowdsourcing platforms are typically simple, we assume that each question, called a Correspondence Correctness Question (CCQ), asks the crowd to decide whether a given correspondence should exist in the correct matching. We propose frameworks and efficient algorithms to dynamically manage the CCQs in order to maximize the uncertainty reduction within a limited budget of questions. We develop two novel approaches, namely "Single CCQ" and "Multiple CCQ", which adaptively select, publish, and manage the questions. We verify the value of our solutions with simulation and a real implementation. |
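
The idea of choosing a CCQ to maximize uncertainty reduction can be illustrated with a small sketch. It is not the paper's "Single CCQ" or "Multiple CCQ" algorithm. It assumes that uncertainty is measured as the Shannon entropy of a probability distribution over candidate matchings, and that the crowd always answers correctly. The attribute names, probabilities, and function names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_entropy_after(matchings, corr):
    """Expected entropy once a (assumed reliable) crowd answers a CCQ on corr."""
    expected = 0.0
    for keep in (True, False):  # crowd says "yes" / "no"
        kept = [p for m, p in matchings if (corr in m) == keep]
        answer_prob = sum(kept)
        if answer_prob > 0:
            # Renormalize the matchings consistent with this answer.
            expected += answer_prob * entropy([p / answer_prob for p in kept])
    return expected

def best_ccq(matchings):
    """Return the correspondence with the largest expected entropy reduction."""
    prior = entropy([p for _, p in matchings])
    candidates = sorted(set().union(*(m for m, _ in matchings)))
    gains = {c: prior - expected_entropy_after(matchings, c) for c in candidates}
    best = max(candidates, key=gains.get)
    return best, gains[best]

# Hypothetical candidate matchings: (set of correspondences, probability).
matchings = [
    (frozenset({("name", "full_name"), ("phone", "tel")}), 0.4),
    (frozenset({("name", "full_name"), ("phone", "mobile")}), 0.3),
    (frozenset({("name", "surname"), ("phone", "tel")}), 0.2),
    (frozenset({("name", "surname"), ("phone", "fax")}), 0.1),
]

question, gain = best_ccq(matchings)
print(question, round(gain, 3))
```

Under these assumptions, the expected reduction for a correspondence equals the binary entropy of its marginal probability, so the sketch favors the correspondence whose marginal is closest to 0.5. With a limited budget, one would ask that question, update the distribution with the answer, and repeat.
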
Year | DOI | Venue |
---|---|---|
2013 | 10.14778/2536360.2536374 | PVLDB |
Keywords | Field | DocType |
correct matching,data integration system,schema matchings,automated tool,crowdsourcing platform,schema matching,single ccq,uncertainty reduction,multiple ccq,accessible crowdsourcing platform | Data integration,Data mining,Ask price,Computer science,Crowdsourcing,Correctness,Schema matching,Schema (psychology),Uncertainty reduction theory,Database,Semantics | Journal |
Volume | Issue | ISSN |
6 | 9 | 2150-8097 |
Citations | PageRank | References |
43 | 1.35 | 22 |
Authors | ||
---|---|---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Chen Jason Zhang | 1 | 161 | 8.28 |
Lei Chen | 2 | 6239 | 395.84 |
H. V. Jagadish | 3 | 11141 | 2495.67 |
Caleb Chen Cao | 4 | 292 | 12.15 |