Title
A test platform for the INEX heterogeneous track
Abstract
This article presents our work within the INEX 2004 Heterogeneous Track. We focused on taming the structural diversity within the INEX heterogeneous bibliographic corpus. We demonstrate how semantic models and associated inference techniques can be used to solve the problems raised by the structural diversity within a given XML corpus. The first step automatically extracts a set of concepts from each class of INEX heterogeneous documents. A unified set of concepts is then computed, which synthesizes the interesting concepts from the whole corpus. Individual corpora are connected to the unified set of concepts via conceptual mappings. This approach is implemented as an application of the KadoP platform for peer-to-peer warehousing of XML documents. While this work caters to the structural aspects of XML information retrieval, the extensibility of the KadoP system makes it an interesting test platform into which components developed by several INEX participants could be plugged, exploiting the opportunities of peer-to-peer data and service distribution.
Year
2004
DOI
10.1007/11424550_29
Venue
INEX
Keywords
inex heterogeneous document,inex heterogeneous track,xml corpus,test platform,structural aspect,inex heterogeneous bibliographic corpus,inex participant,unified set,structural diversity,xml information retrieval,xml document,individual corpus,semantic model
Field
World Wide Web,Streaming XML,Information retrieval,XML,XML validation,Inference,Computer science,Xml information retrieval,Extensibility,XML Schema Editor
DocType
Conference
Volume
3493
ISSN
0302-9743
ISBN
3-540-26166-4
Citations
2
PageRank
0.40
References
9
Authors
4
Name
Order
Citations
PageRank
Serge Abiteboul190952941.83
Ioana Manolescu22630235.86
Benjamin Nguyen3416.30
Nicoleta Preda417314.40