Abstract | ||
---|---|---|
This article presents our work within the INEX 2004 Heterogeneous Track. We focused on taming the structural diversity within the INEX heterogeneous bibliographic corpus. We demonstrate how semantic models and associated inference techniques can be used to solve the problems raised by the structural diversity within a given XML corpus. The first step automatically extracts a set of concepts from each class of INEX heterogeneous documents. A unified set of concepts is then computed, which synthesizes the interesting concepts from the whole corpus. Individual corpora are connected to the unified set of concepts via conceptual mappings. This approach is implemented as an application of the KadoP platform for peer-to-peer warehousing of XML documents. While this work caters to the structural aspects of XML information retrieval, the extensibility of the KadoP system makes it an interesting test platform into which components developed by several INEX participants could be plugged, exploiting the opportunities of peer-to-peer data and service distribution. |
Year | DOI | Venue |
---|---|---|
2004 | 10.1007/11424550_29 | INEX |
Keywords | Field | DocType |
inex heterogeneous document,inex heterogeneous track,xml corpus,test platform,structural aspect,inex heterogeneous bibliographic corpus,inex participant,unified set,structural diversity,xml information retrieval,xml document,individual corpus,semantic model | World Wide Web,Streaming XML,Information retrieval,XML,XML validation,Inference,Computer science,Xml information retrieval,Extensibility,XML Schema Editor | Conference |
Volume | ISSN | ISBN |
3493 | 0302-9743 | 3-540-26166-4 |
Citations | PageRank | References |
2 | 0.40 | 9 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Serge Abiteboul | 1 | 9095 | 2941.83 |
Ioana Manolescu | 2 | 2630 | 235.86 |
Benjamin Nguyen | 3 | 41 | 6.30 |
Nicoleta Preda | 4 | 173 | 14.40 |