Title
On the Codd semantics of SQL nulls
Abstract
Theoretical models used in database research often have subtle differences with those occurring in practice. One particular mismatch that is usually neglected concerns the use of marked nulls to represent missing values in theoretical models of incompleteness, while in an SQL database these are all denoted by the same syntactic object. It is commonly argued that results obtained in the model with marked nulls carry over to SQL, because SQL nulls can be interpreted as Codd nulls, which are simply marked nulls that do not repeat. This argument, however, does not take into account that even simple queries may produce answers where distinct occurrences of NULL do in fact denote the same unknown value. For such queries, interpreting SQL nulls as Codd nulls would incorrectly change the semantics of query answers. To use results about Codd nulls for real-life SQL queries, we need to understand which queries preserve the Codd interpretation of SQL nulls. We show, however, that the class of relational algebra queries preserving Codd interpretation is not recursively enumerable, which necessitates looking for sufficient conditions for such preservation. Those can be obtained by exploiting the information provided by NOT NULL constraints on the database schema. We devise mild syntactic restrictions on queries that guarantee preservation, do not limit the full expressiveness of queries on databases without nulls, and can be checked efficiently.
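The mismatch described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the table `r`, the `Null` class, and the helpers `codd_reading` and `possible_rows` are hypothetical names chosen for illustration, and the two-element domain is an arbitrary assumption. A self-product of a one-row table holding a single NULL yields a row with two NULLs that both stem from the same stored value; reading them as Codd nulls (fresh, non-repeating) admits more possible rows than the marked-null answer does.

```python
import sqlite3
from itertools import count, product


class Null:
    """Hypothetical marked null: two occurrences denote the same unknown
    value exactly when they are the same Python object."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"N{self.label}"


def sql_self_product():
    """Run the self-product in SQLite; every null is printed as one NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE r (a INTEGER)")
    conn.execute("INSERT INTO r VALUES (NULL)")
    rows = conn.execute("SELECT r1.a, r2.a FROM r AS r1, r AS r2").fetchall()
    conn.close()
    return rows


def codd_reading(rows):
    """Interpret each SQL NULL (None) as a fresh, non-repeating null."""
    fresh = count(1)
    return [tuple(Null(next(fresh)) if v is None else v for v in row)
            for row in rows]


def possible_rows(row, domain):
    """All rows obtained by assigning domain values to the nulls of a row,
    giving the same value to repeated occurrences of the same null."""
    nulls = {id(v): v for v in row if isinstance(v, Null)}
    result = set()
    for values in product(domain, repeat=len(nulls)):
        valuation = dict(zip(nulls, values))
        result.add(tuple(valuation[id(v)] if isinstance(v, Null) else v
                         for v in row))
    return result


sql_answer = sql_self_product()

# The same query evaluated over a marked null: the null repeats in the answer.
n = Null(1)
marked_answer = [(x, y) for (x,) in [(n,)] for (y,) in [(n,)]]

# The SQL answer re-read under the Codd interpretation: two distinct nulls.
codd_answer = codd_reading(sql_answer)

same = possible_rows(marked_answer[0], [0, 1])
codd = possible_rows(codd_answer[0], [0, 1])

print(sql_answer)
print(sorted(same))
print(sorted(codd))
```

Under these assumptions the marked-null answer only allows rows whose two components agree, whereas the Codd reading also allows rows whose components differ, which is the change in semantics the abstract warns about.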
Year: 2017
DOI: 10.1016/j.is.2018.08.001
Venue: Information Systems
Keywords: SQL, Null, Semantics, Relational database
Field: SQL, Data mining, Programming language, Computer science, Recursively enumerable language, Database schema, Relational algebra, Missing data, Syntax, Semantics, Expressivity
DocType: Conference
Volume: 86
ISSN: 0306-4379
Citations: 0
PageRank: 0.34
References: 5
Authors: 2
Name
Order
Citations
PageRank
Paolo Guagliardo17410.53
Leonid Libkin23446764.02