Title
Nuggeteer: automatic nugget-based evaluation using descriptions and judgements
Abstract
The TREC Definition and Relationship questions are evaluated on the basis of information nuggets that may be contained in system responses. Human evaluators provide informal descriptions of each nugget, and judgements (assignments of nuggets to responses) for each response submitted by participants. While human evaluation is the most accurate way to compare systems, approximate automatic evaluation becomes critical during system development. We present Nuggeteer, a new automatic evaluation tool for nugget-based tasks. Like the first such tool, Pourpre, Nuggeteer uses words in common between candidate answer and answer key to approximate human judgements. Unlike Pourpre, but like human assessors, Nuggeteer creates a judgement for each candidate-nugget pair, and can use existing judgements instead of guessing. This creates a more readily interpretable aggregate score, and allows developers to track individual nuggets through the variants of their system. Nuggeteer is quantitatively comparable in performance to Pourpre, and provides qualitatively better feedback to developers.
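The abstract describes judging each candidate-nugget pair by word overlap, preferring an existing human judgement when one is available. The following is a minimal sketch of that idea, assuming a simple unigram-recall criterion; the function names, tokenization, and threshold are illustrative assumptions, not Nuggeteer's actual implementation or parameters.

```python
import re

THRESHOLD = 0.5  # assumed cutoff; a real tool would tune this per task


def tokens(text):
    """Lowercased word tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def judge_nugget(candidate, nugget_description, known_judgements=None):
    """Return True if the candidate is judged to contain the nugget.

    Prefers an existing human judgement for this (candidate, nugget)
    pair, as the abstract describes; otherwise guesses from word
    overlap between the candidate and the nugget description.
    """
    key = (candidate, nugget_description)
    if known_judgements and key in known_judgements:
        return known_judgements[key]
    nugget_words = tokens(nugget_description)
    if not nugget_words:
        return False
    # Recall of nugget-description words in the candidate answer.
    recall = len(nugget_words & tokens(candidate)) / len(nugget_words)
    return recall >= THRESHOLD


# Example: high overlap with the nugget description yields a positive judgement.
print(judge_nugget("The tool scores answers by word overlap",
                   "scores answers using word overlap"))  # True under this sketch
```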
Year
2006
DOI
10.3115/1220835.1220883
Venue
HLT-NAACL
Keywords
automatic nugget-based evaluation, system response, new automatic evaluation tool, approximate automatic evaluation, human evaluation, system development, relationship question, human assessor, approximate human judgement, human evaluator, candidate answer, natural language, question answering
Field
Question answering, Computer science, Judgement, Natural language, Artificial intelligence, Natural language processing
DocType
Conference
Citations
12
PageRank
1.27
References
5
Authors
2
Name            Order  Citations  PageRank
Gregory Marton  1      139        13.19
Alexey Radul    2      35         8.90