Paper Info

Title
Wikipedia Entity Retrieval for Dutch and Spanish.

Abstract
We developed two systems (for Dutch and Spanish) for the GikiCLEF task, in which Wikipedia pages have to be found that match a description in natural language. We concentrated on linguistic analysis of the query, for mapping the question onto the most relevant Wikipedia categories, and for extracting additional constraints that matching pages have to satisfy. In addition, for Spanish we experimented with query expansion for improved recall of the IR process. In both the Dutch and Spanish systems we tried to incorporate additional knowledge sources (WordNet, Yago, DbPedia) for better question analysis and retrieval results. The Dutch system obtained a GikiCLEF score of 2.5 (7th overall and 7th for Dutch). The Spanish system was still under development at the time of the official evaluation, and performed poorly. We show that the completed system would have performed well at the 2009 task.

Year	Venue	Keywords
2009	CLEF (Working Notes)	spanish,linguistic analysis,wikipedia,dutch,entity ranking
Field	DocType	Citations
Information retrieval,Query expansion,Computer science,Question analysis,Natural language,Artificial intelligence,Natural language processing,WordNet,Recall,Linguistic analysis	Conference	0
PageRank	References	Authors
0.34	4	2

Authors (2 rows)

Name	Order	Citations	PageRank
Gosse Bouma	1	483	70.88
Sergio Duarte	2	0	0.34
