Title
Automatic repairing of web wrappers
Abstract
We study the problem of automatic repairing of wrappers for Web information providers. The majority of Web wrappers use "hooks" or "landmarks" to find and extract relevant information from Web pages, and such wrappers often become inoperable when the page structure changes. The solution we propose in this paper extends conventional forward wrappers with alternative classifiers built using content features of extracted information and wrappers processing pages backward. We report some preliminary results on information extraction recovery and wrapper repair for a set of real Web provider changes.
Year: 2001
DOI: 10.1145/502932.502938
Venue: WIDM
Keywords: information extraction recovery, web wrapper, alternative classifier, real web provider change, conventional forward wrapper, relevant information, page structure, web information provider, content feature, web page, information extraction, web pages
Field: Static web page, Data interoperability, Data mining, World Wide Web, Web page, Information retrieval, Computer science, Information extraction, Web information
DocType: Conference
ISBN: 1-58113-444-4
Citations: 15
PageRank: 0.79
References: 8
Authors: 1
Name: Boris Chidlovskii
Order, Citations, PageRank: 141152.58