Abstract |
---|
We describe an evaluation of result set filtering techniques for providing ultra-high precision in the task of presenting related news for general web queries. In this task, the negative user experience generated by retrieving non-relevant documents has a much worse impact than not retrieving relevant ones. We adapt cost-based metrics from the document filtering domain to this result filtering problem in order to explicitly examine the tradeoff between missing relevant documents and retrieving non-relevant ones. A large manual evaluation of three simple threshold filters shows that the basic approach of counting matching title terms outperforms filters that also incorporate selected abstract terms based on part-of-speech or higher-level linguistic structures. At the same time, these cost-based metrics allow us to explicitly determine which other tasks would benefit from these alternative techniques. |
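
As a rough illustration of the two ideas named in the abstract, the Python sketch below pairs a threshold filter that counts matching title terms with a weighted cost that penalizes retrieved non-relevant documents more heavily than missed relevant ones. It is not the authors' implementation: the function names, the whitespace tokenization, and the `fp_cost`/`fn_cost` weights are all assumptions made for the example, and the paper's actual metric definitions are not given in this record.

```python
def title_match_count(query: str, title: str) -> int:
    """Count distinct query terms that also appear in the title.

    Tokenization is plain lowercase whitespace splitting (an assumption).
    """
    query_terms = set(query.lower().split())
    title_terms = set(title.lower().split())
    return len(query_terms & title_terms)


def threshold_filter(query: str, titles: list[str], min_matches: int) -> list[str]:
    """Keep only titles sharing at least `min_matches` terms with the query."""
    return [t for t in titles if title_match_count(query, t) >= min_matches]


def filtering_cost(retrieved, relevant, fp_cost: float = 5.0, fn_cost: float = 1.0) -> float:
    """Weighted error cost; the default weights are hypothetical.

    A false positive is a retrieved non-relevant document; a false
    negative is a relevant document that was filtered out.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    false_positives = len(retrieved - relevant)
    false_negatives = len(relevant - retrieved)
    return fp_cost * false_positives + fn_cost * false_negatives


if __name__ == "__main__":
    query = "mars rover landing"
    titles = [
        "NASA Mars rover landing succeeds",
        "Rover dog food recall",
        "Mars candy sales rise",
    ]
    relevant = {titles[0]}
    for k in (1, 2, 4):
        kept = threshold_filter(query, titles, k)
        print(k, len(kept), filtering_cost(kept, relevant))
```

Raising the threshold trades false positives for false negatives; with unequal weights, the cost makes that tradeoff explicit for a given task.
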
Year | DOI | Venue |
---|---|---|
2004 | 10.1145/1008992.1009087 | SIGIR |
Keywords | Field | DocType |
---|---|---|
cost-based metrics, missing relevant document, general web query, abstract term, retrieving non-relevant, basic approach, higher-level linguistic structure, alternative technique, non-relevant document, current news search result, large manual evaluation, user experience, evaluation, part of speech, measurement | Data mining, User experience design, Information retrieval, Result set, Computer science, Filter (signal processing), Filtering problem, Document filtering | Conference |
ISBN | Citations | PageRank |
---|---|---|
1-58113-881-4 | 2 | 0.43 |
References | Authors |
---|---|
5 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Steven M. Beitzel | 1 | 696 | 46.72 |
Eric C. Jensen | 2 | 696 | 46.72 |
Abdur Chowdhury | 3 | 2013 | 160.59 |
David Grossman | 4 | 525 | 34.73 |
Ophir Frieder | 5 | 3300 | 419.55 |