Title
Peer-to-peer information retrieval using self-organizing semantic overlay networks
Abstract
Content-based full-text search is a challenging problem in Peer-to-Peer (P2P) systems. Traditional approaches have either been centralized or use flooding to ensure accuracy of the results returned. In this paper, we present pSearch, a decentralized non-flooding P2P information retrieval system. pSearch distributes document indices through the P2P network based on document semantics generated by Latent Semantic Indexing (LSI). The search cost (in terms of different nodes searched and data transmitted) for a given query is thereby reduced, since the indices of semantically related documents are likely to be co-located in the network. We also describe techniques that help distribute the indices more evenly across the nodes, and further reduce the number of nodes accessed using appropriate index distribution as well as using index samples and recently processed queries to guide the search. Experiments show that pSearch can achieve performance comparable to centralized information retrieval systems by searching only a small number of nodes. For a system with 128,000 nodes and 528,543 documents (from news, magazines, etc.), pSearch searches only 19 nodes and transmits only 95.5 KB of data during the search, whereas the top 15 documents returned by pSearch and LSI have a 91.7% intersection.
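The co-location idea in the abstract can be illustrated with a toy sketch. It is not pSearch's actual placement or routing scheme: it assumes documents already have low-dimensional semantic vectors (such as LSI would produce), that each node owns a hypothetical position in the same space, and that an index is stored on the node most similar to the document. All names, vectors, and the nearest-node rule are assumptions made for illustration. The last function computes a top-k intersection of the kind the abstract reports.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def nearest_node(vec, node_positions):
    """Index of the node whose (hypothetical) position is most similar to vec."""
    return max(range(len(node_positions)),
               key=lambda i: cosine(vec, node_positions[i]))

def place_indices(doc_vecs, node_positions):
    """Store each document index on its nearest node (assumed placement rule)."""
    placement = {i: [] for i in range(len(node_positions))}
    for doc_id, vec in doc_vecs.items():
        placement[nearest_node(vec, node_positions)].append(doc_id)
    return placement

def search(query_vec, doc_vecs, node_positions, placement, k):
    """Visit only the node nearest to the query and rank the indices it holds."""
    node = nearest_node(query_vec, node_positions)
    ranked = sorted(placement[node],
                    key=lambda d: cosine(query_vec, doc_vecs[d]),
                    reverse=True)
    return node, ranked[:k]

def top_k_overlap(a, b, k):
    """Fraction of the top-k results shared by two rankings."""
    return len(set(a[:k]) & set(b[:k])) / k

# Made-up 2-dimensional semantic vectors standing in for LSI output.
doc_vecs = {"d0": (0.9, 0.1), "d1": (0.8, 0.3),
            "d2": (0.1, 0.95), "d3": (0.2, 0.7)}
node_positions = [(1.0, 0.0), (0.0, 1.0)]  # two hypothetical nodes
query = (1.0, 0.2)

placement = place_indices(doc_vecs, node_positions)
node, hits = search(query, doc_vecs, node_positions, placement, k=2)

# Centralized baseline: rank every document against the query.
centralized = sorted(doc_vecs, key=lambda d: cosine(query, doc_vecs[d]),
                     reverse=True)
overlap = top_k_overlap(hits, centralized, 2)
```

In this toy setup the query visits one node out of two and still returns the same top 2 documents as the centralized ranking, because the semantically related indices sit on the same node.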
Year: 2003
DOI: 10.1145/863955.863976
Venue: SIGCOMM
Keywords: self-organizing semantic overlay network, document semantics, document index, centralized information retrieval system, appropriate index distribution, peer-to-peer information retrieval, p2p network, p2p information retrieval system, search cost, content-based full-text search, present psearch, psearch search, information retrieval system, self organization, latent semantic indexing, information retrieval, indexation, p2p, overlay network
Field: Latent semantic indexing, Information retrieval, Peer-to-peer, Computer science, Search cost, Overlay network, Semantics
DocType: Conference
Volume: 33
Issue: 4
ISSN: 0146-4833
ISBN: 1-58113-735-4
Citations: 266
PageRank: 10.01
References: 19
Authors: 3
Name, Order, Citations, PageRank
Chunqiang Tang, 1, 1287, 75.09
Zhichen Xu, 2, 1057, 66.72
Sandhya Dwarkadas, 3, 3504, 257.31