Abstract | ||
---|---|---|
We propose the Multiple Join Path (MJP) framework for obtaining high-quality information by linking fields across multiple databases when the underlying databases contain poor-quality data, characterized by violations of integrity constraints such as keys and functional dependencies within and across databases. MJP associates quality scores with candidate answers by first scoring individual data paths between a pair of field values, taking into account data quality with respect to specified integrity constraints, and then agglomerating scores across multiple data paths that serve as corroborating evidence for a candidate answer. We address the problem of finding the top-few (highest-quality) answers in the MJP framework using novel techniques, and demonstrate the utility of our techniques using real data and our Virtual Integration Prototype testbed. |
Year | Venue | Keywords |
---|---|---|
2006 | CleanDB | data quality,integrity constraints,functional dependency |
Field | DocType | Citations |
Data mining,Multiple data,Data quality,Testbed,Functional dependency,Data integrity,Virtual integration,Mathematics | Conference | 8 |
PageRank | References | Authors |
0.83 | 11 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yannis Kotidis | 1 | 1994 | 208.82 |
Amélie Marian | 2 | 1280 | 77.92 |
Divesh Srivastava | 3 | 8984 | 1191.22 |