Abstract |
---|
The XSD binary floating point datatypes are regularly used for precise numeric values in RDF. However, using these datatypes for knowledge representation can systematically impair data quality and, compared to the XSD decimal datatype, increases the probability that data processing produces false results. We argue that in most cases the XSD decimal datatype is better suited to representing numeric values in RDF. A survey of actual datatype usage on the relevant subset of the December 2020 Web Data Commons dataset, containing 19 453 060 341 literals from real web data, substantiates the practical relevance of the described problem: 29%-68% of binary floating point values are distorted due to the datatype. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1007/978-3-031-06981-9_10 | SEMANTIC WEB, ESWC 2022 |
Keywords | DocType | Volume |
---|---|---|
Data Quality, Datatypes, Floating Point Numbers, Knowledge Graphs, Numerical Stability, RDF, XSD | Conference | 13261 |
ISSN | Citations | PageRank |
---|---|---|
0302-9743 | 0 | 0.34 |
References | Authors |
---|---|
0 | 1 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jan Martin Keil | 1 | 0 | 1.01 |
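
The distortion described in the abstract can be illustrated with a minimal sketch. It assumes that Python's `float` (IEEE 754 binary64) stands in for `xsd:double` and that `decimal.Decimal` stands in for `xsd:decimal`. The helper `is_distorted` is a hypothetical name introduced here for illustration. It is not taken from the paper and does not reproduce the paper's survey method.

```python
from decimal import Decimal


def is_distorted(lexical: str) -> bool:
    """Return True if reading `lexical` as a binary64 float changes its exact value.

    Hypothetical helper for illustration only. Decimal(float) converts the
    stored binary value exactly, so the comparison is exact.
    """
    return Decimal(float(lexical)) != Decimal(lexical)


# 0.1 has no finite binary representation, so the stored value differs.
print(is_distorted("0.1"))   # True
# 0.5 is a power of two and is stored exactly.
print(is_distorted("0.5"))   # False

# Errors also accumulate during processing: summing 0.1 ten times.
float_sum = sum(float("0.1") for _ in range(10))
decimal_sum = sum(Decimal("0.1") for _ in range(10))

print(float_sum == 1.0)              # False
print(decimal_sum == Decimal("1"))   # True
```

Under these assumptions, a literal such as `"0.1"^^xsd:double` denotes a value slightly different from one tenth, while `"0.1"^^xsd:decimal` denotes exactly one tenth.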