Title
The XML web: a first study
Abstract
Although originally designed for large-scale electronic publishing, XML plays an increasingly important role in the exchange of data on the Web. In fact, it is expected that XML will become the lingua franca of the Web, eventually replacing HTML. Not surprisingly, there has been a great deal of interest in XML both in industry and in academia. Nevertheless, to date no comprehensive study of the XML Web (i.e., the subset of the Web made of XML documents only) or of its contents has been made. This paper is the first attempt at describing the XML Web and the documents contained in it. Our results are drawn from a sample of a repository of the publicly available XML documents on the Web, consisting of about 200,000 documents. Our results show that, despite its short history, XML already permeates the Web, both in terms of generic domains and geographically. Also, our results about the contents of the XML Web provide valuable input for the design of algorithms, tools and systems that use XML in one form or another.
Year: 2003
DOI: 10.1145/775152.775223
Venue: WWW
Keywords: great deal, short history, important role, xml web, lingua franca, comprehensive study, available xml document, xml document, large-scale electronic publishing, generic domain, electronic publishing, xml documents, statistical analysis
Field: XML Base, World Wide Web, XML Encryption, Efficient XML Interchange, Streaming XML, Information retrieval, XML validation, Computer science, Document Structure Description, XML Schema Editor, XML Signature
DocType: Conference
ISBN: 1-58113-680-3
Citations: 83
PageRank: 4.14
References: 14
Authors: 3
Name | Order | Citations | PageRank
Laurent Mignet | 1 | 345 | 29.41
Denilson Barbosa | 2 | 610 | 43.52
Pierangelo Veltri | 3 | 648 | 82.26