Title
A scalable and effective full-text search in P2P networks
Abstract
We consider the problem of full-text search involving multi-term queries in a network of self-organizing, autonomous peers. Existing approaches do not scale well with respect to the number of peers, because they either require access to a large number of peers or incur a high communication cost in order to achieve good query results. In this paper, we present a novel algorithmic framework for processing multi-term queries in P2P networks that achieves high recall while using (per-query) a small number of peers and a low communication cost, thereby enabling high query throughput. Our approach is based on per-query peer-selection strategy using two-dimensional histograms of score distributions. A full utilization of the histograms incurs a high communication cost. We show how to drastically reduce this cost by employing a two-phase peer-selection algorithm. We also describe an adaptive approach to peer selection that further increases the recall. Experiments on a large real-world collection show that the recall is indeed high while the number of involved peers and the communication cost are low.
Year
DOI
Venue
2009
10.1145/1645953.1646281
CIKM
Keywords
Field
DocType
effective full-text search,good query result,multi-term query,communication cost,adaptive approach,p2p network,high recall,large number,high communication cost,low communication cost,small number,high query throughput,p2p,self organization,clustering,histograms
Small number,Query throughput,Data mining,Histogram,Computer science,Full text search,Cluster analysis,Recall,Scalability
Conference
Citations 
PageRank 
References 
4
0.42
6
Authors
3
Name
Order
Citations
PageRank
Yosi Mass157460.91
Yehoshua Sagiv253621575.95
Michal Shmueli-Scheuer38916.11