Title
Completeness of queries over SQL databases
Abstract
Data completeness is an important aspect of data quality. We consider a setting where databases can be incomplete in two ways: records may be missing, and records may contain null values. We (i) formalize when the answer set of a query is complete in spite of such incompleteness, and (ii) introduce table completeness statements, by which one can express that certain parts of a database are complete. We then study how to deduce from a set of table completeness statements that a query can be answered completely. Null values as used in SQL are ambiguous: they can indicate either that no attribute value exists or that a value exists but is unknown. We study completeness reasoning for the different interpretations. We show that in the combined case it is necessary to syntactically distinguish between different kinds of null values, and we present an encoding for doing that in standard SQL databases. With this technique, any SQL DBMS evaluates complete queries correctly with respect to the different meanings that nulls can carry. We study the complexity of completeness reasoning and provide algorithms that in most cases match the worst-case lower bounds.
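The ambiguity of SQL nulls described in the abstract can be illustrated with a small sketch. The `person` table, its column names, and the `phone_status` tag column are hypothetical and chosen only for illustration; this is one possible way to distinguish kinds of nulls in a standard SQL database, not necessarily the encoding presented in the paper.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        name TEXT PRIMARY KEY,
        phone TEXT,                 -- NULL whenever no number is stored
        phone_status TEXT NOT NULL  -- hypothetical tag for the kind of null
            CHECK (phone_status IN ('known', 'unknown', 'none'))
    );
    INSERT INTO person VALUES ('ann', '555-0100', 'known');
    INSERT INTO person VALUES ('bob', NULL, 'unknown');  -- has a phone, number not recorded
    INSERT INTO person VALUES ('eve', NULL, 'none');     -- has no phone at all
""")


def names(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# A plain IS NULL test cannot tell the two meanings of null apart.
ambiguous = names("SELECT name FROM person WHERE phone IS NULL ORDER BY name")

# With the tag column, "people who have no phone" can be asked directly.
no_phone = names("SELECT name FROM person WHERE phone_status = 'none' ORDER BY name")

# People who have a phone, whether or not the number is recorded.
has_phone = names("SELECT name FROM person WHERE phone_status <> 'none' ORDER BY name")

print(ambiguous, no_phone, has_phone)
```

In this sketch the `IS NULL` query returns both `bob` and `eve`, even though only `eve` has no phone; the tagged queries separate the two cases, which is the kind of syntactic distinction the abstract argues is needed when both interpretations of null occur together.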
Year
2012
DOI
10.1145/2396761.2396875
Venue
CIKM
Keywords
different meaning, sql dbms, standard sql databases, table completeness statement, data completeness, complete query, completeness reasoning, different kind, null value, different interpretation, data quality
Field
SQL, Data mining, Data quality, Information retrieval, Computer science, Query by Example, Completeness (statistics), Metadata management, Database, Null (SQL), Encoding (memory)
DocType
Conference
Citations
6
PageRank
0.56
References
13
Authors (2)
Name | Order | Citations | PageRank
Werner Nutt | 1 | 2009 | 395.43
Simon Razniewski | 2 | 157 | 27.07