Abstract | ||
---|---|---|
This paper tackles the issue of comparing evaluation results across multiple QA test collections (NTCIR QAC1 and QAC2). We identify two features that correlate moderately with system performance in QAC1 and QAC2, and use these features to evaluate the difficulty of the two test collections. The answer categories of questions also affect system performance. The evaluation results suggest that QAC2 is easier than QAC1 in terms of these features, and that we are making progress for at least some categories. We make a proposal for future QAC tasks regarding the data needed for evaluation using multiple test collections. |
Year | Venue | Field |
---|---|---|
2004 | NTCIR | Information retrieval, Computer science |
DocType | Citations | PageRank |
Conference | 0 | 0.34 |
References | Authors | |
3 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Masako Nomoto | 1 | 0 | 0.34 |
Yoshio Fukushige | 2 | 10 | 1.63 |
Mitsuhiro Sato | 3 | 0 | 0.68 |
H. Suzuki | 4 | 238 | 31.31 |