Title
Taking Risks with Confidence
Abstract
Risk-based evaluation is a failure analysis tool that can be combined with traditional effectiveness metrics to ensure that the improvements observed are consistent across topics when comparing systems. Here we explore the stability of confidence intervals in inference-based risk measurement, extending previous work to five different commonly used inference testing techniques. Using the Robust04 and TREC Core 2017 NYT corpora, we show that risk inferences using parametric methods appear to disagree with their non-parametric counterparts, warranting further investigation. Additionally, we explore how the number of topics being evaluated affects confidence interval stability, and find that more than 50 topics appear to be required before risk-sensitive comparison results are consistent across different inference testing frameworks.
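The setup described in the abstract can be made concrete with a short Python sketch. This is not the paper's code: the URisk-style alpha-weighted loss, the function names (urisk, urisk_ci_t, urisk_ci_bootstrap), and the specific choice of a Student-t interval versus a percentile-bootstrap interval are illustrative assumptions, used only to show how a parametric and a non-parametric confidence interval can be computed over the same per-topic score differences between a run and a baseline.

    # Illustrative sketch only; not taken from the paper.
    import numpy as np
    from scipy import stats

    def _adjusted_deltas(run_scores, base_scores, alpha):
        # Per-topic differences; losses are penalised by a factor of (1 + alpha).
        delta = np.asarray(run_scores, dtype=float) - np.asarray(base_scores, dtype=float)
        return np.where(delta >= 0, delta, (1 + alpha) * delta)

    def urisk(run_scores, base_scores, alpha=1.0):
        # Mean of the alpha-weighted per-topic deltas (a URisk-style score).
        return _adjusted_deltas(run_scores, base_scores, alpha).mean()

    def urisk_ci_t(run_scores, base_scores, alpha=1.0, level=0.95):
        # Parametric interval: Student-t over the alpha-weighted deltas.
        adj = _adjusted_deltas(run_scores, base_scores, alpha)
        n = len(adj)
        half = stats.t.ppf((1 + level) / 2, df=n - 1) * adj.std(ddof=1) / np.sqrt(n)
        return adj.mean() - half, adj.mean() + half

    def urisk_ci_bootstrap(run_scores, base_scores, alpha=1.0, level=0.95,
                           reps=10_000, seed=0):
        # Non-parametric interval: resample topics with replacement and take
        # the empirical quantiles of the resampled means.
        rng = np.random.default_rng(seed)
        adj = _adjusted_deltas(run_scores, base_scores, alpha)
        means = rng.choice(adj, size=(reps, len(adj)), replace=True).mean(axis=1)
        return (np.percentile(means, 100 * (1 - level) / 2),
                np.percentile(means, 100 * (1 + level) / 2))

With small topic sets the two intervals can differ noticeably, which is the kind of disagreement between parametric and non-parametric inference over risk, and its dependence on the number of topics, that the paper investigates.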
Year
2019
DOI
10.1145/3372124.3372125
Venue
Proceedings of the 24th Australasian Document Computing Symposium
Keywords
Risk-based evaluation, confidence interval, effectiveness metric
DocType
Conference
ISBN
978-1-4503-7766-9
Citations
0
PageRank
0.34
References
0
Authors
4
Name              Order  Citations  PageRank
Rodger Benham     1      7          3.19
Ben Carterette    2      21         5.08
Alistair Moffat   3      5913       728.91
Shane Culpepper   4      519        47.52