Title
IQN routing: integrating quality and novelty in P2P querying and ranking
Abstract
We consider a collaboration of peers autonomously crawling the Web. A pivotal issue when designing a peer-to-peer (P2P) Web search engine in this environment is query routing: selecting a small subset of (a potentially very large number of relevant) peers to contact to satisfy a keyword query. Existing approaches for query routing work well on disjoint data sets. However, the peers' data collections often overlap heavily, as popular documents are crawled by many peers. Techniques for estimating the cardinality of the overlap between sets, designed for and incorporated into information retrieval engines, are largely lacking. In this paper we present a comprehensive evaluation of appropriate overlap estimators, showing how they can be incorporated into an efficient, iterative approach to query routing, coined Integrated Quality Novelty (IQN). We propose to further enhance our approach using histograms, combining overlap estimation with the available score/ranking information. Finally, we conduct a performance evaluation in MINERVA, our prototype P2P Web search engine.
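The idea of routing iteratively on both quality and novelty can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes each peer publishes a Bloom filter over its document ids, that the size of a set can be estimated from the number of set bits, and that a peer's score is simply quality multiplied by estimated novelty. All names and parameters (`bloom`, `cardinality`, `novelty`, `route`, `M_BITS`, `K_HASHES`) are hypothetical.

```python
import hashlib
import math

M_BITS = 4096   # filter size in bits (assumed value)
K_HASHES = 3    # hash functions per item (assumed value)


def bloom(doc_ids):
    """Build a Bloom filter, stored as an int bitmask, over document ids."""
    bits = 0
    for doc in doc_ids:
        for i in range(K_HASHES):
            digest = hashlib.sha256(f"{i}:{doc}".encode()).digest()
            bits |= 1 << (int.from_bytes(digest[:8], "big") % M_BITS)
    return bits


def cardinality(bits):
    """Estimate how many items were inserted from the count of set bits."""
    set_bits = bin(bits).count("1")
    if set_bits >= M_BITS:
        return float("inf")  # saturated filter: no usable estimate
    return -(M_BITS / K_HASHES) * math.log(1 - set_bits / M_BITS)


def novelty(candidate_bits, covered_bits):
    """Estimate how many candidate documents are not yet covered.

    The bitwise OR of two filters is the filter of the union, so
    |candidate minus covered| is roughly |union| - |covered|.
    """
    union = candidate_bits | covered_bits
    return max(0.0, cardinality(union) - cardinality(covered_bits))


def route(peers, max_peers):
    """Greedily select peers by quality * estimated novelty.

    peers maps a peer name to a (quality, bloom_bits) pair.
    """
    covered = 0            # filter of everything the chosen peers hold
    remaining = dict(peers)
    chosen = []
    while remaining and len(chosen) < max_peers:
        best = max(
            remaining,
            key=lambda p: remaining[p][0] * novelty(remaining[p][1], covered),
        )
        chosen.append(best)
        covered |= remaining.pop(best)[1]  # re-estimate novelty next round
    return chosen


peers = {
    "A": (1.0, bloom(range(100))),                              # large, high quality
    "B": (0.8, bloom(list(range(95)) + list(range(100, 105)))), # mostly overlaps A
    "C": (0.5, bloom(range(200, 260))),                         # smaller but disjoint
}
selected = route(peers, 2)
```

In this toy setup a quality-only ranking would contact A and B. Once A is selected, B contributes almost no new documents, so the novelty-aware loop prefers C for the second slot.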
Year
2006
DOI
10.1007/11687238_12
Venue
EDBT
Field
Web search engine, Data mining, Bloom filter, Web search query, Conjunctive query, Search engine, Ranking, Query expansion, Peer-to-peer, Computer science, Database
DocType
Conference
Volume
3896
ISSN
0302-9743
ISBN
3-540-32960-9
Citations
15
PageRank
0.62
References
31
Authors
13
Name                  Order  Citations  PageRank
Sebastian Michel      1      946        58.72
Matthias Bender       2      309        14.34
Peter Triantafillou   3      1261       151.76
Gerhard Weikum        4      12710      2146.01
Yannis E. Ioannidis   5      4971       988.40
Marc H. Scholl        6      1336       454.75
Joachim W. Schmidt    7      1147       919.40
Florian Matthes       8      1386       424.99
Michael Hatzopoulos   9      129        11.25
Klemens Boehm         10     16         0.97
Alfons Kemper         11     3519       769.50
Torsten Grust         12     1482       148.79
Christian Boehm       13     15         0.62