Title
VITAL: Structured and clustered super-peer network for similarity search
Abstract
VITAL is a novel P2P indexing structure that supports similarity search over multidimensional vectors in addition to exact search. It is designed to scale to millions of peers and billions of shared documents and to adapt to high network dynamics. To exploit peer heterogeneity, VITAL is a super-peer (SP) network in which every peer is an SP candidate and a simple election protocol selects the SPs. Every SP locally monitors its “vital” signs of memory, processing, and bandwidth, and initiates the SP election protocol based on its capacity and load limits. In addition, the SP overlay is structured as a CAN distributed hash table to guarantee both the correctness and the responsiveness of the query protocol. A novel data replication model is introduced in which every peer groups its shared documents into local clusters (LCs) and publishes each LC summary at a few SPs to achieve content-based clustering and firework query propagation. Every peer establishes TCP connections with the SPs that maintain its LC summaries. VITAL has no central component and does not require global knowledge; however, it requires identifying a set of global cluster (GC) centroids to be disjointly managed by the elected SPs. In addition, CAN zone overloading is seamlessly applied to relieve overwhelmed SPs, and it provides an extra layer of physical proximity clustering. The scalability analysis shows that the peer index requires less than 3% extra storage and that a query can, on average, be satisfied by visiting 1.6% of the established TCP connections.
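The replication step described in the abstract (each LC summary published at a few SPs, with GC centroids divided among the SPs) can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes an LC summary is a single vector, that "a few SPs" means the SPs managing the k nearest GC centroids under Euclidean distance, and the names nearest_centroids, publish_plan, and k are hypothetical.

```python
import math

def nearest_centroids(summary, gc_centroids, k=2):
    """Return the indices of the k GC centroids closest to one LC summary.

    Assumption: Euclidean distance; the abstract does not name the metric.
    """
    order = sorted(range(len(gc_centroids)),
                   key=lambda i: math.dist(summary, gc_centroids[i]))
    return order[:k]

def publish_plan(lc_summaries, gc_centroids, k=2):
    """Map each LC index to the GC indices whose managing SPs would store its summary."""
    return {lc: nearest_centroids(s, gc_centroids, k)
            for lc, s in enumerate(lc_summaries)}

# Hypothetical 2-D data: three GC centroids, two local cluster summaries.
gc_centroids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
lc_summaries = [(1.0, 2.0), (9.0, 2.0)]
plan = publish_plan(lc_summaries, gc_centroids, k=2)
print(plan)
```

Under these assumptions, a similarity query would start at the SP managing the centroid nearest to the query vector and spread to neighbouring SPs, which is one way to read the "firework" propagation mentioned above.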
Year
2015
DOI
10.1007/s12083-014-0304-0
Venue
Peer-to-Peer Networking and Applications
Keywords
Peer-to-Peer (P2P) network, Super-peer network, Distributed hash table (DHT), Content addressable network (CAN), Similarity search
Field
Network dynamics, Replication (computing), Computer science, Correctness, Search engine indexing, Computer network, Cluster analysis, Nearest neighbor search, Distributed computing, Distributed hash table, Scalability
DocType
Journal
Volume
8
Issue
6
ISSN
1936-6442
Citations
0
PageRank
0.34
References
75
Authors
3
Name, Order, Citations, PageRank
Sahar M. Ghanem, 1, 15, 4.25
Mohamed A. Ismail, 2, 86, 15.45
Samia G. Omar, 3, 3, 0.79