Abstract |
---|
XML documents contain substantial redundancy in their structure part, because each path from the root
node to a leaf node is explicitly represented and typically large sets of such path instances belong to
a path class, i.e., the nodes of the path instances are labeled by the same sequence of element (or attribute)
names. To save storage space and I/O cost, we want to get rid of this structural redundancy to the extent
possible. While all known methods for the physical representation (storage) of XML documents proceed from
the root via the element/attribute hierarchy (internal nodes) down to the leaves (values), we follow an
upside-down approach which explicitly stores the values and only reconstructs
the internal nodes, if needed. The cornerstones for such a solution are suitable node labels and a path
synopsis which efficiently represents all path classes of an XML document. As a solution, we propose a compact
internal storage format for native XML database systems where the inner structure of the stored documents
is virtualized. Because this elementless storage format provides an efficient reconstruction of a document
using its path synopsis, all processing properties are preserved and the semantics of navigational and declarative
operations of XML languages remains unchanged. Adjusted indexes support the full spectrum of so-called
content-and-structure single path queries. Apart from greatly reduced storage consumption, our approach
demonstrates its superiority, compared to competing methods, not only for a substantial fraction of those
queries, but also for storing, reconstructing, and navigating XML documents. |
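
The reconstruction idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' storage format. It assumes that each stored value carries a prefix-based label with one division per tree level and a reference to its path class, and that the path synopsis maps each path class to its sequence of element names. The names `PATH_SYNOPSIS`, `LEAVES`, and `reconstruct` are hypothetical.

```python
from xml.etree import ElementTree as ET

# Hypothetical path synopsis: path-class id -> element names from the root
# down to the element that directly holds the value.
PATH_SYNOPSIS = {
    1: ("bib", "book", "title"),
    2: ("bib", "book", "author"),
}

# Hypothetical elementless store: only values are kept, each with a
# prefix-based label (one division per level) and its path-class id.
LEAVES = [
    ((1, 1, 1), 1, "XML Storage"),
    ((1, 1, 2), 2, "Mathis"),
    ((1, 2, 1), 1, "Path Synopses"),
]


def reconstruct(leaves, synopsis):
    """Rebuild the inner element structure from labeled values."""
    nodes = {}   # label prefix -> already reconstructed element
    root = None
    # Sorting by label yields document order under the stated assumption.
    for label, path_class, value in sorted(leaves):
        names = synopsis[path_class]
        if len(names) != len(label):
            raise ValueError("label depth does not match path class")
        parent = None
        for depth, name in enumerate(names, start=1):
            prefix = label[:depth]          # label of the ancestor at this level
            elem = nodes.get(prefix)
            if elem is None:                # ancestor not yet materialized
                if parent is None:
                    elem = ET.Element(name)
                    root = elem
                else:
                    elem = ET.SubElement(parent, name)
                nodes[prefix] = elem
            parent = elem
        parent.text = value                 # deepest element holds the value
    return root


if __name__ == "__main__":
    print(ET.tostring(reconstruct(LEAVES, PATH_SYNOPSIS), encoding="unicode"))
```

In this sketch, every ancestor label is a prefix of the leaf label and every ancestor name comes from the path class, so no inner node needs to be stored. Real labeling schemes and synopses carry more detail (attributes, mixed content, gaps for insertions) that is omitted here.
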
Year | DOI | Venue |
---|---|---|
2009 | 10.1007/s00450-009-0056-x | Computer Science - R&D |
Keywords | Field | DocType |
storage formats · XML indexes · native XML database management systems · elementless XML storage · path synopsis · prefix-based node labeling · CR Subject Classification: E.2, H.2.4, H.2.2 · XML document · management system · indexation · spectrum | XML Encryption, Efficient XML Interchange, Information retrieval, XML validation, Computer science, Parallel computing, Document Structure Description, XML database, Root element, XML schema, Database, XML Catalog | Journal |
Volume | Issue | ISSN |
24 | 1-2 | 1865-2042 |
Citations | PageRank | References |
6 | 0.48 | 32 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Christian Mathis | 1 | 147 | 10.87 |
Theo Härder | 2 | 1132 | 307.12 |
Karsten Schmidt | 3 | 25 | 4.82 |