Abstract | ||
---|---|---|
Embedded functional dependencies (eFDs) advance data management applications by data completeness and integrity requirements. We show that the discovery problem of eFDs is NP-complete, W[2]-complete in the output, and has a minimum solution space that is larger than the maximum solution space for functional dependencies. Nevertheless, we use novel data structures and search strategies to develop row-efficient, column-efficient, and hybrid algorithms for eFD discovery. Our experiments demonstrate that the algorithms scale well in terms of their design targets, and that ranking the eFDs by the number of redundant data values they cause can provide useful guidance in identifying meaningful eFDs for applications. Finally, we demonstrate the benefits of introducing completeness requirements and ranking by the number of redundant data values for approximate and genuine functional dependencies. |
Year | DOI | Venue |
---|---|---|
2020 | 10.1145/3318464.3389786 | SIGMOD/PODS '20: International Conference on Management of Data, Portland, OR, USA, June 2020 |
Keywords | DocType | ISBN |
Discovery, Embedded functional dependency, Missing data | Conference | 978-1-4503-6735-6 |
Citations | PageRank | References |
1 | 0.35 | 0 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ziheng Wei | 1 | 8 | 6.92 |
Sven Hartmann | 2 | 409 | 42.86 |
Sebastian Link | 3 | 462 | 39.59 |