Abstract | ||
---|---|---|
Some aspects of Data Fusion (DF) for Information Retrieval (IR) are explored using a set of data from the Fifth International Conference on Text Retrieval, TREC5. It has been observed from time to time that DF applied to a pair of systems or schemes for IR may yield results that are better than those of either participating scheme. It has been conjectured that this occurs only rarely, or occurs only when poor schemes are being combined, or occurs only for problems in which there are so few relevant documents that the results are probably due to statistical fluctuation. Based on a geometrical model of DF, we derive an equation for effective DF. This equation shows that in the ideal case the performance of a pair of IR schemes may be approximated by a quadratic polynomial. We statistically test this assumption for TREC5 Routing data. The results of the regression analysis show that our equation for the effect of DF is generally valid. |
Year | DOI | Venue |
---|---|---|
2002 | 10.1002/meet.1450390114 | PROCEEDINGS OF THE ASIST ANNUAL MEETING |
Keywords | Field | DocType |
geometric model,data processing,mathematical formulas,data fusion,geometry,information retrieval | Data mining,Data processing,Regression analysis,Computer science,Geometric modeling,Algorithm,Sensor fusion,Quadratic function,Text retrieval | Conference |
Volume | Issue | ISSN |
39 | 1 | 0044-7870 |
Citations | PageRank | References |
2 | 0.35 | 8 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ulukbek Ibraev | 1 | 2 | 0.69 |
Kwong Bor Ng | 2 | 110 | 13.37 |
Paul B. Kantor | 3 | 716 | 115.67 |