Title
Structuring the Web
Abstract
The WWW is a very large and rich information source, but it has no structure, so locating data of interest can be difficult. In particular, a page may be divided into different logical sections of information, and highlighting them may improve both browsing and searching. We propose a simple Web page structuring that introduces the "semantic block" as a more granular level at which to categorize information inside a page. We also propose a set of XML tags to be added to the existing HTML tags in order to locate such blocks and to use structured pages with both current browsers and future, structure-aware ones, with the goal of a gradual migration towards a more structured Web. We apply our technique to several Web sites in order to determine which semantic blocks are needed, using two simple Java-based tools that we developed to add the XML tags and manage the resulting structure. Finally, we consider how the schema can be represented for better browsing.
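As a rough illustration of the idea of marking semantic blocks with XML tags layered over existing HTML, the following minimal Java sketch wraps HTML fragments in a block tag, lists the block types found in a page, and strips the tags again to recover plain HTML. The tag name `sblock`, its `type` attribute, the block types `navigation` and `content`, and all class and method names are hypothetical choices made for this example; they are not the tag set or the tools described in the paper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: the "sblock" tag and its "type" attribute are
// illustrative assumptions, not the tags defined by the paper's authors.
final class SemanticBlockSketch {

    // Matches <sblock type="...">...</sblock>; DOTALL lets a block span lines.
    // The lazy match does not handle nested blocks.
    private static final Pattern BLOCK = Pattern.compile(
        "<sblock\\s+type=\"([^\"]*)\">(.*?)</sblock>", Pattern.DOTALL);

    // Wraps an HTML fragment in the hypothetical block tag.
    static String wrap(String type, String htmlFragment) {
        return "<sblock type=\"" + type + "\">" + htmlFragment + "</sblock>";
    }

    // Lists the block types found in a page, in document order.
    static List<String> blockTypes(String page) {
        List<String> types = new ArrayList<>();
        Matcher m = BLOCK.matcher(page);
        while (m.find()) {
            types.add(m.group(1));
        }
        return types;
    }

    // Removes only the block tags and keeps the enclosed HTML unchanged,
    // approximating what a browser that ignores unknown tags would render.
    static String stripTags(String page) {
        return page.replaceAll("</?sblock(\\s+type=\"[^\"]*\")?>", "");
    }

    public static void main(String[] args) {
        String page = "<html><body>"
            + wrap("navigation", "<ul><li><a href=\"/\">Home</a></li></ul>")
            + wrap("content", "<p>Main text</p>")
            + "</body></html>";
        System.out.println(blockTypes(page));
        System.out.println(stripTags(page));
    }
}
```

Because the block tags only wrap existing markup, removing them leaves the original HTML untouched, which is one plausible way to read the abstract's goal of serving both current and structure-aware browsers from the same page.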
Year
2000
DOI
10.1109/DEXA.2000.875167
Venue
DEXA Workshop
Keywords
simple java-based tool, better browsing, different logical section, structured web, existing html tag, xml tag, rich information source, simple web page structure, web site, semantic block, xml, knowledge management, html, java, web pages, html tags, information retrieval, www, data structures, telecommunications, graphics, world wide web
Field
Static web page, Data mining, Web page, Computer science, Structuring, Schema (psychology), HTML element, Printer-friendly, World Wide Web, Information retrieval, XML, Java, Database
DocType
Conference
ISBN
0-7695-0680-1
Citations
3
PageRank
0.41
References
8
Authors
3
Name, Order, Citations, PageRank
V. Carchiolo, 1, 26, 8.25
A. Longheu, 2, 6, 1.57
M. Malgeri, 3, 8, 3.22