Title
USHER: Improving data quality with dynamic forms
Abstract
Data quality is a critical problem in modern databases. Data entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been little research into automatic methods for improving data quality at entry time. In this paper, we propose USHER, an end-to-end system for form design, entry, and data quality assurance. Using previous form submissions, USHER learns a probabilistic model over the questions of the form. USHER then applies this model at every step of the data entry process to improve data quality. Before entry, it induces a form layout that captures the most important data values of a form instance as quickly as possible. During entry, it dynamically adapts the form to the values being entered, and enables real-time feedback to guide the data enterer toward their intended values. After entry, it re-asks questions that it deems likely to have been entered incorrectly. We evaluate all three components of USHER using two real-world data sets. Our results demonstrate that each component has the potential to improve data quality considerably, at a reduced cost when compared to current practice.
Year
2010
DOI
10.1109/ICDE.2010.5447832
Venue
IEEE Transactions on Knowledge and Data Engineering
Keywords
data quality improvement, form layout, improving data quality, entry time, data-entry form, previous form submission, real-world data sets, data quality, form instance, automatic methods, modern databases, usher, data mining, important data value, real-world data set, data quality assurance, form design, end-to-end system, dynamic forms, probabilistic model, computer science, feedback, probabilistic logic, data models, error correction, artificial intelligence, predictive models, quality assurance, bayesian methods, databases, real time
DocType
Conference
Volume
23
Issue
8
ISSN
1084-4627
ISBN
978-1-4244-5444-0
Citations
20
PageRank
2.19
References
13
Authors
5
Name                    Order  Citations  PageRank
Kuang Chen              1      236        18.24
Harr Chen               2      522        28.20
Neil Conway             3      458        21.46
Joseph M. Hellerstein   4      14093      1651.14
Tapan S. Parikh         5      926        85.91