Title
A Data Parallel Approach to XML Parsing and Query
Abstract
Data-parallel XML parsing faces a crucial problem: partitioning the XML document. Existing approaches need a pre-parse step to determine the partitions. In this paper, we propose a direct parallel method that solves this problem without pre-parsing. In the direct parallel method, we start parallel parsing directly by locating the "light tower", a particular character that is subject to some exceptions, called clues. We handle the exceptions by watching for the clues and reparsing the affected partition when the parsing stage requires it. We also propose a non-synchronized splitter approach to parallel XML querying with XPath expressions. In this approach, we split an XPath expression into pieces that are executed by separate threads, and we use a data structure called the ancestor table to let each thread handle its part of the XPath expression independently, without communication between threads. Our experiments show that our approach scales well from small files to very large files.
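The partitioning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the "light tower" character, so the sketch assumes it is `<`. The function names `find_partition_starts` and `split_partitions` are hypothetical. The sketch only picks candidate partition starts without a pre-parse. It does not detect the exceptions (for example, a `<` that sits inside a comment or a CDATA section), which the paper handles by watching clues and reparsing.

```python
def find_partition_starts(data: bytes, num_parts: int, marker: bytes = b"<") -> list:
    """Pick candidate partition start offsets without pre-parsing.

    Assumption: the marker byte "<" stands in for the paper's
    "light tower" character, which the abstract does not name.
    """
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    starts = [0]
    chunk = max(1, len(data) // num_parts)
    for i in range(1, num_parts):
        # Jump to the nominal boundary, then scan forward to the next marker.
        pos = data.find(marker, i * chunk)
        if pos == -1:
            break  # no marker left, so the last partition runs to the end
        if pos > starts[-1]:
            starts.append(pos)  # skip duplicates when chunks share a marker
    return starts


def split_partitions(data: bytes, num_parts: int) -> list:
    """Cut the buffer at the candidate starts; each piece could go to one thread."""
    starts = find_partition_starts(data, num_parts)
    ends = starts[1:] + [len(data)]
    return [data[s:e] for s, e in zip(starts, ends)]


doc = b"<a><b>text</b><c>more</c></a>"
parts = split_partitions(doc, 3)
```

Every piece begins at a `<`, and concatenating the pieces reproduces the input. A real parser would still need to confirm that each boundary is a genuine tag start and reparse the partition if it is not.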
Year: 2011
DOI: 10.1109/HPCC.2011.74
Venue: HPCC
Keywords: parallel processing, xml parsing, xml, xml querying, parallel xml querying, direct parallel method, xml document partitioning, data structures, vtd-xml, non-synchronized splitter approach, data structure, xpath expression, crucial problem, data parallel, approach scale, data parallel approach, exception handling, parsing stage, nonsynchronized splitter approach, grammars, multi-core, data-parallel xml parsing, ancestor table, document handling, parallel parsing, partitioning xml document, parallel xml parsing, multi core
Field: Efficient XML Interchange, Programming language, Streaming XML, XML Schema (W3C), Computer science, XML validation, XML database, Theoretical computer science, XML schema, XPath, Simple API for XML
DocType: Conference
ISBN: 978-0-7695-4538-7
Citations: 3
PageRank: 0.38
References: 11
Authors: 2
Name, Order, Citations, PageRank
Cheng-Han You, 1, 3, 0.38
Sheng-De Wang, 2, 720, 68.13