Title
Towards realistic sampling: generating dependencies in a relational database
Abstract
Managing large amounts of information is one of the most expensive, time-consuming and non-trivial activities, and it usually requires expert knowledge. In a wide range of application areas, such as data mining, histogram construction, approximate query evaluation, and software validation, handling exponentially growing databases has become a difficult challenge, and a subset of the data is generally preferred. As a solution to the current challenges in managing large amounts of data, database sampling from the available operational data has proved to be a powerful technique. However, none of the existing sampling approaches considers the dependencies between the data in a relational database. In this paper, we propose a novel approach towards constructing a realistic testing environment: we analyze the distribution of data in the original database along these dependencies before sampling, so that the sample database is representative of the original database.
Year: 2013
DOI: 10.1145/2448556.2448568
Venue: ICUIMC
Keywords: data mining, current challenge, approximate query evaluation, relational database, towards realistic sampling, sample database, original database, large amount, application area, existing sampling approach, operational data, test, measurement, algorithms
Field: Data mining, Database tuning, Relational database, Database model, Information retrieval, Computer science, View, Database testing, Database design, Database schema, Database theory
DocType: Conference
Citations: 0
PageRank: 0.34
References: 11
Authors: 3
Name                   Order   Citations   PageRank
Teodora Sandra Buda    1       26          7.50
John Murphy            2       75          10.07
Morten Kristiansen     3       11          2.67