Abstract | ||
---|---|---|
Access to on-line information via the Web is exploding. Index and retrieval engines have already begun to integrate a huge variety of heterogeneous repositories. However, the heterogeneity issue remains, both in the search formats and in the formats of the result pages. In this paper we focus on HTML-based search and result presentations. We discuss our experience in the design, development, and maintenance of wrappers (in the context of the Knowledge Broker project). We outline different ways to write wrappers, illustrate some of the lessons learned, and conclude by describing a semi-automatic approach for efficiently wrapping Web-based information repositories. Throughout the paper, we give illustrative examples for hands-on readers. |
Year | Venue | Keywords |
---|---|---|
1997 | RIAO | world wide web, heterogeneous repositories, rule-based parsing, information extraction, wrapping, rule based |
Field | DocType | Citations |
World Wide Web,Knowledge broker,Computer science,Information extraction,Web application | Conference | 11 |
PageRank | References | Authors |
7.97 | 8 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Boris Chidlovskii | 1 | 411 | 52.58 |
Uwe M. Borghoff | 2 | 412 | 175.51 |
Pierre-Yves Chevalier | 3 | 67 | 19.27 |