Abstract | ||
---|---|---|
The discovery of functional dependencies from populated databases has received a great deal of attention because it is a key concern in database analysis. Such a capability is strongly required in database administration and design, and is of great interest in other application fields such as query folding. Investigated for many years, the issue has recently been addressed in a novel and more efficient way by applying principles of data mining algorithms. The two algorithms that follow this trend are TANE and Dep-Miner, which strongly improve on previous proposals. In this paper, we propose a new approach adopting a data mining point of view. We define a novel characterization of minimal functional dependencies. This formal framework is sound and simpler than related work. We introduce the new concept of free set for capturing the sources of functional dependencies. The targets of such dependencies are characterized using the concepts of closure and quasi-closure of attribute sets. Our approach is implemented by the algorithm FUN, which is particularly efficient: its performance is comparable to or better than that of the two best operational solutions (to our knowledge), TANE and Dep-Miner. It makes use of various optimization techniques and can work on very large databases. Comparative experiments on real-life and synthetic data, with varying degrees of correlation, assess the performance of FUN against TANE and Dep-Miner. Moreover, our approach also exhibits (without significant additional execution time) embedded functional dependencies, i.e. dependencies captured in any subset of the attribute set originally considered. Embedded dependencies capture knowledge that is especially relevant in all fields where materialized data sets are managed (e.g. materialized views widely used in data warehouses). |
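
To make the notion of a minimal functional dependency concrete, here is a small Python sketch. It is not the FUN algorithm from the paper, nor TANE or Dep-Miner: it is a brute-force, level-wise search that relies only on the standard fact that X → A holds in a relation exactly when projecting on X and on X ∪ {A} yields the same number of distinct tuples. The function names (`projection_size`, `holds`, `minimal_fds`) and the toy relation are hypothetical, chosen for illustration, and dependencies with an empty left-hand side (constant columns) are not considered.

```python
from itertools import combinations


def projection_size(rows, attrs):
    """Count the distinct value combinations of `attrs` across all rows."""
    return len({tuple(row[a] for a in attrs) for row in rows})


def holds(rows, lhs, rhs):
    """X -> A holds iff |projection on X| == |projection on X + A|."""
    return projection_size(rows, lhs) == projection_size(rows, tuple(lhs) + (rhs,))


def minimal_fds(rows, attributes):
    """Brute-force, level-wise search for minimal non-trivial dependencies.

    Left-hand sides are explored by increasing size, so any dependency
    already found with a smaller left-hand side rules out its supersets.
    """
    found = []
    for size in range(1, len(attributes)):
        for lhs in combinations(attributes, size):
            for rhs in attributes:
                if rhs in lhs:
                    continue  # trivial dependency
                # Not minimal if a proper subset of lhs already determines rhs.
                if any(set(l) < set(lhs) and r == rhs for l, r in found):
                    continue
                if holds(rows, lhs, rhs):
                    found.append((lhs, rhs))
    return found


# Toy relation (illustrative data only).
rows = [
    {"emp": 1, "dept": "A", "mgr": "X"},
    {"emp": 2, "dept": "A", "mgr": "X"},
    {"emp": 3, "dept": "B", "mgr": "Y"},
    {"emp": 4, "dept": "C", "mgr": "Y"},
]

for lhs, rhs in minimal_fds(rows, ("emp", "dept", "mgr")):
    print(", ".join(lhs), "->", rhs)
```

This exhaustive search is exponential in the number of attributes, which is the cost that data-mining-style approaches such as those named in the abstract aim to reduce through pruning.
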
Year | DOI | Venue |
---|---|---|
2001 | 10.1016/S0306-4379(01)00032-1 | Inf. Syst. |
Keywords | Field | DocType |
data mining,functional dependency,algorithms,embedded dependency inference,data mining point,database design,lattices,very large database,data warehouse,materialized views,dep,synthetic data | Data warehouse,Data mining,Computer science,Inference,Database design,Functional dependency,Synthetic data,Database administrator,Materialized view,Database,Dependency theory (database theory) | Journal |
Volume | Issue | ISSN |
26 | 7 | 0306-4379 |
Citations | PageRank | References |
34 | 1.71 | 36 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Noel Novelli | 1 | 127 | 37.10 |
Rosine Cicchetti | 2 | 453 | 175.14 |