Title
Generation of test databases using sampling methods
Abstract
Populating the testing environment with relevant data is a major challenge in software validation. It generally requires expert knowledge of the system under development, because the data critically affects the outcome of the tests designed to assess the system. Current practice either focuses on developing efficient algorithms for generating synthetic data or uses the production environment for testing. The latter is an invaluable strategy for providing real test cases that uncover issues critically affecting users of the system. However, the production environment generally holds large amounts of data that are difficult to handle and analyze. Sampling the production database is a potential solution to these challenges. In this research, we propose two database sampling methods, VFDS and CoDS, for populating the testing environment. The first is a very fast random sampling approach, while the second aims to preserve the distribution of the data in order to produce a representative sample. In particular, CoDS focuses on the dependencies between data in different tables and tries to preserve the distributions of these dependencies.
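The contrast between the two sampling goals can be illustrated with a minimal sketch. It is not VFDS or CoDS, whose algorithms the abstract does not specify. It assumes a hypothetical two-table schema (`customers` and `orders` linked by `customer_id`), and all function and column names are invented for illustration. The first function samples child rows uniformly and pulls in the parent rows they reference. The second stratifies parents by how many child rows reference them, so the sample keeps roughly the same distribution of that dependency.

```python
import random
from collections import Counter, defaultdict


def random_sample(parents, children, fk, rate, seed=0):
    """Uniformly sample child rows, then add every parent row they reference."""
    rng = random.Random(seed)
    k = max(1, round(rate * len(children)))
    sampled_children = rng.sample(children, k)
    needed = {row[fk] for row in sampled_children}
    # Keeping the referenced parents preserves referential integrity.
    sampled_parents = [p for p in parents if p["id"] in needed]
    return sampled_parents, sampled_children


def dependency_preserving_sample(parents, children, fk, rate, seed=0):
    """Sample parents stratified by their number of children (fan-out)."""
    rng = random.Random(seed)
    fanout = Counter(row[fk] for row in children)
    strata = defaultdict(list)
    for p in parents:
        strata[fanout.get(p["id"], 0)].append(p)
    sampled_parents = []
    for count in sorted(strata):
        group = strata[count]
        # Take the same fraction from every fan-out stratum.
        k = max(1, round(rate * len(group)))
        sampled_parents.extend(rng.sample(group, k))
    keep = {p["id"] for p in sampled_parents}
    # Keep all children of the sampled parents so each fan-out is unchanged.
    sampled_children = [c for c in children if c[fk] in keep]
    return sampled_parents, sampled_children


# Hypothetical data: customer i has (i % 4) orders.
customers = [{"id": i} for i in range(100)]
orders = [
    {"id": f"{i}-{j}", "customer_id": i}
    for i in range(100)
    for j in range(i % 4)
]

if __name__ == "__main__":
    for sampler in (random_sample, dependency_preserving_sample):
        ps, cs = sampler(customers, orders, "customer_id", rate=0.2)
        per_parent = Counter(c["customer_id"] for c in cs)
        dist = Counter(per_parent.get(p["id"], 0) for p in ps)
        print(sampler.__name__, len(ps), len(cs), sorted(dist.items()))
```

In the full data each customer has between 0 and 3 orders in equal proportion. The stratified sampler reproduces those proportions in the sample, whereas the uniform sampler generally does not, because it keeps only some of each customer's orders.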
Year
2013
DOI
10.1145/2483760.2492397
Venue
ISSTA
Keywords
random sampling approach, relevant data, database sampling, latter aim, efficient algorithm, current practice, sampling method, production environment, synthetic data, different table, testing environment, test databases, relational database, testing
Field
Data mining, Relational database, Computer science, Development environment, Database testing, Synthetic data, Sampling (statistics), Test case, Software verification and validation, Database
DocType
Conference
Citations
0
PageRank
0.34
References
11
Authors
1
Name
Order
Citations
PageRank
Teodora Sandra Buda1267.50