Title
Beyond lazy XML parsing
Abstract
XML has become the standard format for data representation and exchange in domains ranging from the Web to desktop applications. However, wide adoption of XML is hindered by inefficient document-parsing methods. Recent work on lazy parsing is a major step towards alleviating this problem. Even so, lazy parsers must still read the entire XML document to extract the overall document structure, because XML documents contain no internal navigation pointers. Further, these parsers must load the entire virtual document tree into memory during XML query processing. These overheads significantly degrade the performance of navigation operations. We have developed a framework for efficient XML parsing based on the idea of placing internal physical pointers within the document, which allows large portions of the document to be skipped during parsing. The internal pointers are generated in a way that optimizes parsing for common navigation patterns. A double-Lazy Parser (2LP) that exploits the internal pointers is then used to parse the document. To create the internal pointers, we use constructs supported by the current W3C XML standard. We study our pointer generation and parsing algorithms both theoretically and experimentally, and show that they perform considerably better than existing approaches.
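As a rough illustration of the pointer idea in the abstract, the sketch below splits one logical document into fragments and leaves a pointer element in place of each subtree, so that a navigation step parses only the fragments it actually reaches. This is a toy under stated assumptions, not the authors' 2LP or their pointer-generation algorithm: the `ref` element, its `tag` and `href` attributes, the `FRAGMENTS` store, and the `LazyDoc` class are all hypothetical names introduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment store: each entry stands in for a separately stored
# piece of one logical document. A <ref> element is an assumed pointer that
# records the tag of the subtree it replaces and where that subtree lives.
FRAGMENTS = {
    "root": (
        '<library>'
        '<ref tag="books" href="books"/>'
        '<ref tag="journals" href="journals"/>'
        '</library>'
    ),
    "books": "<books><book>XML Basics</book><book>Parsing</book></books>",
    "journals": "<journals><journal>TODS</journal></journals>",
}


class LazyDoc:
    """Parses a fragment only when navigation first reaches it."""

    def __init__(self, fragments):
        self.fragments = fragments
        self.parsed = []  # fragment names, in the order they were parsed

    def load(self, name):
        """Parse one fragment and record that it was touched."""
        self.parsed.append(name)
        return ET.fromstring(self.fragments[name])

    def find_child(self, element, tag):
        """Return the first child with the given tag, or None.

        A pointer is followed only if its recorded tag matches, so
        non-matching subtrees are never parsed.
        """
        for child in element:
            if child.tag == "ref":
                if child.get("tag") == tag:
                    return self.load(child.get("href"))
            elif child.tag == tag:
                return child
        return None


doc = LazyDoc(FRAGMENTS)
root = doc.load("root")
journals = doc.find_child(root, "journals")
titles = [j.text for j in journals]
print(titles)      # ['TODS']
print(doc.parsed)  # ['root', 'journals']
```

In this sketch the `books` fragment is never parsed when navigating to `journals`, which is the kind of skipping the abstract describes. Storing the target tag on the pointer is a design choice made for the example: it lets the navigator decide whether to follow a pointer without opening the subtree it refers to.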
Year
2007
DOI
10.1007/978-3-540-74469-6_9
Venue
DEXA
Keywords
entire xml document, lazy xml parsing, internal physical pointer, overall document structure, entire virtual document tree, w3c xml standard, internal pointer, efficient xml, xml query processing, xml document, internal navigation pointer, data representation, document structure, object model, document object model, xml
Field
Efficient XML Interchange, Streaming XML, Programming language, Well-formed document, XML validation, Computer science, Document Structure Description, XML schema, Simple API for XML, Database, Document type definition
DocType
Conference
Volume
4653
ISSN
0302-9743
ISBN
3-540-74467-3
Citations
6
PageRank
0.59
References
12
Authors
3
Name, Order, Citations, PageRank
Fernando Farfán, 1, 37, 4.13
Vagelis Hristidis, 2, 2814, 185.78
Raju Rangaswami, 3, 750, 41.17