Title
Functional and embedded dependency inference: a data mining point of view
Abstract
The issue of discovering functional dependencies from populated databases has received a great deal of attention because it is a key concern in database analysis. Such a capability is strongly required in database administration and design while being of great interest in other application fields such as query folding. Investigated for long years, the issue has been recently addressed in a novel and more efficient way by applying principles of data mining algorithms. The two algorithms fitting in such a trend are TANE and Dep-Miner. They strongly improve previous proposals. In this paper, we propose a new approach adopting a data mining point of view. We define a novel characterization of minimal functional dependencies. This formal framework is sound and simpler than related work. We introduce the new concept of free set for capturing source of functional dependencies. By using the concepts of closure and quasi-closure of attribute sets, targets of such dependencies are characterized. Our approach is enforced through the algorithm FUN which is particularly efficient since it is comparable or improves the two best operational solutions (according to our knowledge): TANE and Dep-Miner. It makes use of various optimization techniques and it can work on very large databases. Applying on real life or synthetic data more or less correlated, comparative experiments are performed in order to assess performance of FUN against TANE and Dep-Miner. Moreover, our approach also exhibits (without significant additional execution time) embedded functional dependencies, i.e. dependencies captured in any subset of the attribute set originally considered. Embedded dependencies capture a knowledge specially relevant in all fields where materialized data sets are managed (e.g. materialized views widely used in data warehouses).
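As a rough illustration of what discovering minimal functional dependencies from a populated relation involves, the sketch below uses a naive level-wise search. It is not the FUN algorithm, whose details are not given in this abstract, and it uses none of the free set, closure, or quasi-closure machinery. It relies only on the standard fact that X -> A holds in a relation exactly when projecting on X and on X plus A yields the same number of distinct tuples. The names `holds`, `minimal_fds`, and the sample `rows` are hypothetical and chosen for illustration.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in rows.

    lhs is a tuple of attribute names, rhs a single attribute name.
    The test compares the number of distinct projections on lhs
    with the number of distinct projections on lhs + (rhs,).
    """
    proj_lhs = {tuple(r[a] for a in lhs) for r in rows}
    proj_both = {tuple(r[a] for a in lhs + (rhs,)) for r in rows}
    return len(proj_lhs) == len(proj_both)

def minimal_fds(rows, attrs):
    """Naive level-wise search for minimal dependencies (illustrative only)."""
    found = []
    for rhs in attrs:
        others = [a for a in attrs if a != rhs]
        sources = []  # minimal left-hand sides already found for this rhs
        for size in range(len(others) + 1):
            for lhs in combinations(others, size):
                # A superset of a known source cannot be minimal.
                if any(set(s) <= set(lhs) for s in sources):
                    continue
                if holds(rows, lhs, rhs):
                    sources.append(lhs)
                    found.append((lhs, rhs))
    return found

# Hypothetical sample relation: each department has exactly one manager.
rows = [
    {"emp": 1, "dept": "A", "mgr": "X"},
    {"emp": 2, "dept": "A", "mgr": "X"},
    {"emp": 3, "dept": "B", "mgr": "Y"},
]
attrs = ("emp", "dept", "mgr")

for lhs, rhs in minimal_fds(rows, attrs):
    print(", ".join(lhs) or "{}", "->", rhs)
```

Calling `minimal_fds` with a subset of the attributes, for example `minimal_fds(rows, ("dept", "mgr"))`, restricts the search to that subset, which is the sense in which the abstract speaks of embedded dependencies. The search is exponential in the number of attributes, so it is suitable only for small examples.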
Year: 2001
DOI: 10.1016/S0306-4379(01)00032-1
Venue: Inf. Syst.
Keywords: data mining, functional dependency, algorithms, embedded dependency inference, data mining point, database design, lattices, very large database, data warehouse, materialized views, dep, synthetic data
Field: Data warehouse, Data mining, Computer science, Inference, Database design, Functional dependency, Synthetic data, Database administrator, Materialized view, Database, Dependency theory (database theory)
DocType: Journal
Volume: 26
Issue: 7
ISSN: Information Systems
Citations: 34
PageRank: 1.71
References: 36
Authors: 2
Name: Noel Novelli; Order: 1; Citations: 127; PageRank: 37.10
Name: Rosine Cicchetti; Order: 2; Citations: 453; PageRank: 175.14