Abstract | ||
---|---|---|
This paper studies a class of neighborhood con-straints to characterize and repair erroneous entity information in multi-relational graph data. (1) We propose a class of constraints called star functional dependencies (StarFDs). Unlike conventional integrity constraints, a StarFDenforces value dependencies conditioned by entities and their relevant neighbors, which are identified by a star pattern that incorporates conjunctive regular path queries. StarFDsachieve a balance between expressiveness and complexity: the validation of StarFDsis tractable, and the satisfiability and implication of StarFDsare NP-complete and coNP-complete, respectively. (2) Given a set of StarFDsΣ and a graph G, the entity repair problem is to compute a minimum repair of G by enforcing Σ with the smallest amount of changes. Although this problem is NP-complete and hard to approximate, we show it is feasible to compute repairs in large graphs. Our approach (a) discriminately detects and resolves errors with optimal, approximable and cost-bounded solutions whenever possible, and (b) incurs a time cost determined by Σ and the size of inconsistencies, for all cases. Using real world data, we show that StarFD-based techniques effectively identify and repair errors. We also show that our repairing algorithms benefit other tasks such as fact checking. |
Year | DOI | Venue |
---|---|---|
2020 | 10.1109/ICDE48307.2020.00027 | 2020 IEEE 36th International Conference on Data Engineering (ICDE) |
Keywords | DocType | ISSN |
data cleaning,knowledge graphs | Conference | 1063-6382 |
ISBN | Citations | PageRank |
978-1-7281-2904-4 | 0 | 0.34 |
References | Authors | |
21 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
peng lin | 1 | 39 | 12.10 |
Qi Song | 2 | 14 | 2.91 |
Yinghui Wu | 3 | 824 | 42.79 |
Jiaxing Pi | 4 | 0 | 0.34 |