Title
VFDS: An Application to Generate Fast Sample Databases
Abstract
Analyzing large amounts of data is often expensive and time-consuming. Highly scalable and efficient techniques are therefore necessary to process and analyze the data and to discover useful information. Database sampling has proven to be a powerful method for overcoming these limitations. Using only a sample of the original large database yields useful information faster, at the potential expense of lower accuracy. In this paper, we demonstrate VFDS, a novel fast database sampling system that maintains the referential integrity of the data. The system is built on top of the open-source database management system MySQL. We present various scenarios to demonstrate the effectiveness of VFDS in terms of approximate query answering, sample size, and execution time, on both real and synthetic databases.
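As an illustration of what "maintaining referential integrity" means for a sample, the following is a minimal sketch and not the VFDS algorithm itself, which the abstract does not describe. It assumes a hypothetical two-table schema (customers, orders) and uses Python's built-in sqlite3 module rather than MySQL so that it runs without a server. Child rows are sampled at random, and every parent row they reference is then added to the sample, so no foreign key in the sample dangles.

```python
import random
import sqlite3


def sample_with_referential_integrity(conn, fraction, seed=0):
    """Sample child rows at random, then keep every parent row they reference."""
    rng = random.Random(seed)
    order_ids = [row[0] for row in conn.execute("SELECT id FROM orders ORDER BY id")]
    k = max(1, round(fraction * len(order_ids)))
    sampled_orders = sorted(rng.sample(order_ids, k))

    marks = ",".join("?" * len(sampled_orders))
    orders = conn.execute(
        f"SELECT id, customer_id FROM orders WHERE id IN ({marks}) ORDER BY id",
        sampled_orders,
    ).fetchall()

    # Pull in the referenced parents so no sampled foreign key dangles.
    customer_ids = sorted({customer_id for _, customer_id in orders})
    marks = ",".join("?" * len(customer_ids))
    customers = conn.execute(
        f"SELECT id, name FROM customers WHERE id IN ({marks}) ORDER BY id",
        customer_ids,
    ).fetchall()
    return customers, orders


def build_demo_database():
    """Create a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, 21)],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(i, (i % 20) + 1) for i in range(1, 201)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_database()
    customers, orders = sample_with_referential_integrity(conn, fraction=0.1)
    print(len(customers), "customers,", len(orders), "orders in the sample")
```

Sampling from the child table and following foreign keys upward is only one possible design. A system handling many tables would also need to choose a starting table and a traversal order over the schema graph, which this sketch leaves out.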
Year
2014
DOI
10.1145/2661829.2661845
Venue
CIKM
Keywords
database sampling, miscellaneous, random, relational database
Field
Data mining, Database tuning, Relational database, Information retrieval, Computer science, Database testing, Database design, Sampling (statistics), Database, Sample size determination, Referential integrity, Scalability
DocType
Conference
Citations
0
PageRank
0.34
References
12
Authors
4
Name | Order | Citations | PageRank
Teodora Sandra Buda | 1 | 26 | 7.50
Thomas Cerqueus | 2 | 45 | 10.23
John Murphy | 3 | 75 | 10.07
Morten Kristiansen | 4 | 11 | 2.67