Abstract |
---|
Computing the functional dependencies that hold on a given data set is one of the most important problems in data profiling. Utilizing new data structures and original techniques for the dynamic computation of stripped partitions, we devise a new hybridization strategy that outperforms the best algorithms in terms of efficiency, column-, and row-scalability. This is demonstrated on real-world benchmark data. We further propose the number of redundant data values for ranking the output of discovery algorithms. Our ranking assesses the relevance of functional dependencies for the given data set. |
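
The abstract names two ingredients, stripped partitions and a redundancy count used for ranking, without defining them in this record. The following is a minimal sketch of the textbook notions, not the paper's data structures or hybrid algorithm. It makes several assumptions: rows are plain dictionaries without missing values, the function names (`stripped_partition`, `fd_holds`, `redundant_values`) and the sample data are hypothetical, and "redundant data values" is read as the right-hand-side value occurrences in rows that share their left-hand-side values with at least one other row.

```python
from collections import defaultdict

def stripped_partition(rows, attrs):
    """Group row indices by their values on attrs and drop singleton groups."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].append(i)
    return [ids for ids in groups.values() if len(ids) > 1]

def fd_holds(rows, lhs, rhs):
    """Check lhs -> rhs: rows that agree on lhs must agree on rhs."""
    for ids in stripped_partition(rows, lhs):
        if len({rows[i][rhs] for i in ids}) > 1:
            return False
    return True

def redundant_values(rows, lhs, rhs):
    """Count rhs occurrences determined by another row with the same lhs values.

    Assumed reading of the ranking measure; meaningful only if lhs -> rhs holds.
    """
    return sum(len(ids) for ids in stripped_partition(rows, lhs))

# Hypothetical sample relation, for illustration only.
rows = [
    {"city": "Auckland", "zip": "1010", "country": "NZ"},
    {"city": "Auckland", "zip": "1010", "country": "NZ"},
    {"city": "Wellington", "zip": "6011", "country": "NZ"},
    {"city": "Sydney", "zip": "2000", "country": "AU"},
]
```

On this sample, `zip -> city` holds and covers two redundant `city` occurrences, while `country -> city` fails because the `NZ` rows disagree on `city`. Singleton groups can be dropped because a row that shares its left-hand side with no other row can never violate a dependency, which is why the stripped form is enough for the check.
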
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ICDE.2019.00137 | 2019 IEEE 35th International Conference on Data Engineering (ICDE) |
Keywords | Field | DocType |
Heuristic algorithms, Partitioning algorithms, Redundancy, Switches, Lattices, Computer science, Data structures | Data structure, Data mining, Ranking, Relational database, Computer science, Functional dependency, Redundancy (engineering), Data profiling, Missing data, Computation | Conference |
ISSN | ISBN | Citations |
1084-4627 | 978-1-5386-7474-1 | 2 |
PageRank | References | Authors |
0.37 | 0 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ziheng Wei | 1 | 8 | 6.92 |
Sebastian Link | 2 | 462 | 39.59 |