Title
Declarative generation of synthetic XML data
Abstract
Synthetic data can be extremely useful in testing and evaluating algorithms, tools and systems. Most synthetic data generators available today are the result of individual benchmarking efforts. Typically, these are complex programs in which the specifications of both the structure and the contents of the data are hard-coded. As a result, it is often difficult to customize these tools to produce synthetic data tailored to specific needs. In this article, we describe the ToXgene synthetic data generator, which is a declarative tool for generating realistic XML data for benchmarking as well as testing purposes. We present our template specification language, which augments XML Schema with probabilistic models that guide the data-generation process. We discuss the architecture of our current implementation, and we argue for ToXgene's usefulness by discussing experimental results and describing two projects that use our tool. Copyright (C) 2006 John Wiley & Sons, Ltd.
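The general idea of a declarative template annotated with probabilistic models can be illustrated with a minimal Python sketch. This is not ToXgene's template language or its implementation: the dictionary-based `TEMPLATE` format, the `sample` and `generate` helpers, and the element names are all hypothetical, chosen only to show how a structure description plus distributions can drive generation instead of hard-coded logic.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical template format (an assumption, NOT ToXgene's actual syntax):
# each element may declare a distribution over how many times it occurs
# ("count") and, for leaves, a distribution over its text content ("text").
TEMPLATE = {
    "name": "catalog",
    "children": [
        {
            "name": "book",
            "count": ("uniform", 2, 5),  # 2 to 5 <book> elements, inclusive
            "children": [
                {"name": "title",
                 "text": ("choice", ["XML Basics", "Data Design", "Query Tuning"])},
                {"name": "price", "text": ("uniform", 10, 99)},
            ],
        }
    ],
}


def sample(spec, rng):
    """Draw one value from a (kind, *params) distribution spec."""
    kind = spec[0]
    if kind == "uniform":
        # Uniform integer in the closed interval [spec[1], spec[2]].
        return rng.randint(spec[1], spec[2])
    if kind == "choice":
        # Uniform pick from a fixed list of values.
        return rng.choice(spec[1])
    raise ValueError(f"unknown distribution: {kind}")


def generate(template, rng):
    """Recursively build an XML element from a template node."""
    elem = ET.Element(template["name"])
    if "text" in template:
        elem.text = str(sample(template["text"], rng))
    for child in template.get("children", []):
        # A child without a "count" distribution occurs exactly once.
        count = sample(child["count"], rng) if "count" in child else 1
        for _ in range(count):
            elem.append(generate(child, rng))
    return elem


if __name__ == "__main__":
    # A seeded generator makes the synthetic document reproducible.
    root = generate(TEMPLATE, random.Random(42))
    print(ET.tostring(root, encoding="unicode"))
```

Changing the shape or content of the generated documents then only requires editing `TEMPLATE`, not the generator code, which is the customization benefit the abstract attributes to a declarative approach.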
Year: 2006
DOI: 10.1002/spe.724
Venue: SOFTWARE-PRACTICE & EXPERIENCE
Keywords: XML, synthetic data, benchmarking, probabilistic generative models
DocType: Journal
Volume: 36
Issue: 10
ISSN: 0038-0644
Citations: 3
PageRank: 0.41
References: 12
Authors: 2
Name
Order
Citations
PageRank
Denilson Barbosa161043.52
Alberto O. Mendelzon248481394.98