Abstract |
---|
With the growing popularity of information retrieval (IR) in distributed systems and in particular P2P Web search, a huge number of protocols and prototypes have been introduced in the literature. However, nearly every paper considers a different benchmark for its experimental evaluation, rendering their mutual comparison and the quantification of performance improvements an impossible task. We present a standardized, general-purpose benchmark for P2P IR systems that finally makes this possible. We start by presenting a detailed requirement analysis for such a standardized benchmark framework that allows for reproducible and comparable experimental setups without sacrificing flexibility to suit different system models. We further suggest Wikipedia as a publicly available and all-purpose document corpus and finally introduce a simple yet flexible clustering strategy that assigns the Wikipedia articles as documents to an arbitrary number of peers. After proposing a standardized, real-world query set as the benchmark workload, we review the metrics to evaluate the benchmark results and present an example benchmark run for our fully-implemented P2P Web search prototype MINERVA. |
Year | Venue | Keywords
---|---|---
2006 | ExpDB | distributed system, requirement analysis, information retrieval, system modeling, p2p

Field | DocType | Citations
---|---|---
Data mining, General purpose, Information retrieval, Computer science, Workload, Popularity, Requirements analysis, Rendering (computer graphics), Cluster analysis, SDET | Conference | 18

PageRank | References | Authors
---|---|---
0.79 | 21 | 6
Name | Order | Citations | PageRank
---|---|---|---
Thomas Neumann | 1 | 2523 | 156.50
Matthias Bender | 2 | 309 | 14.34
Sebastian Michel | 3 | 946 | 58.72
Gerhard Weikum | 4 | 12710 | 2146.01
Philippe Bonnet | 5 | 18 | 0.79
Ioana Manolescu | 6 | 2630 | 235.86