Title
Identifying Novel Information using Latent Semantic Analysis in the WiQA Task at CLEF 2006.
Abstract
In our two-stage system for the English monolingual WiQA Task, snippets were first retrieved if they contained an exact match with the title. Candidates were then passed to the Latent Semantic Analysis component, which judged them Novel if their match with the article text was below a threshold. Run 1 returned the ten best snippets and Run 2 the twenty best. Run 1 was superior, with an Average Yield per Topic of 2.46 and a Precision of 0.37. Compared with other groups, our performance was in the middle of the range except for Precision, where our system was the best. We attribute this to our use of exact title matches in the IR stage. In future work we will vary the approach according to topic type, exploit co-references in conjunction with exact matches, and make use of the elaborate hyperlink structure that is a unique and most interesting aspect of Wikipedia.
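The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the case-insensitive title match, the raw term-count matrix, the use of the maximum cosine similarity against article sentences as the "match", ranking by lowest similarity, and the default values of `threshold` and `rank` are all assumptions made for illustration. Only the overall shape (exact title match, then an LSA threshold test, then a cut-off of ten or twenty snippets) comes from the abstract.

```python
import re
import numpy as np


def tokenize(text):
    """Lower-case a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve_candidates(title, snippets):
    """Stage 1 (IR): keep snippets that contain the title verbatim.

    Case-insensitive matching is an assumption of this sketch.
    """
    needle = title.lower()
    return [s for s in snippets if needle in s.lower()]


def lsa_similarities(article_sentences, candidates, rank=2):
    """Stage 2 (LSA): best cosine match of each candidate with the article.

    Builds a term-by-document count matrix, reduces it with a truncated
    SVD, and compares documents in the reduced space.
    """
    docs = list(article_sentences) + list(candidates)
    vocab = sorted({t for d in docs for t in tokenize(d)})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for t in tokenize(d):
            A[index[t], j] += 1.0
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(rank, len(S))
    doc_vecs = (S[:k, None] * Vt[:k]).T          # one row per document
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                      # avoid division by zero
    unit = doc_vecs / norms
    n = len(article_sentences)
    sims = unit[n:] @ unit[:n].T                 # candidates x sentences
    return sims.max(axis=1)


def novel_snippets(title, article_sentences, snippets,
                   threshold=0.8, top_n=10, rank=2):
    """Return up to top_n candidates whose match is below the threshold.

    Candidates are ordered from least to most similar to the article
    (an assumed reading of "best" snippets).
    """
    candidates = retrieve_candidates(title, snippets)
    if not candidates or not article_sentences:
        return candidates[:top_n]
    sims = lsa_similarities(article_sentences, candidates, rank)
    scored = [(sim, s) for sim, s in zip(sims, candidates) if sim < threshold]
    scored.sort(key=lambda pair: pair[0])
    return [s for _, s in scored[:top_n]]
```

Under these assumptions, calling `novel_snippets` with `top_n=10` would correspond to the Run 1 setting and `top_n=20` to Run 2.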
Year: 2006
DOI: 10.1007/978-3-540-74999-8_66
Venue: CLEF (Working Notes)
Keywords: average yield, exact match, article text, two-stage system, wiqa task, best snippet, exact title, latent semantic analysis, latent semantic analysis component, english monolingual, identifying novel information, ir stage, indexation
DocType: Conference
Volume: 4730
ISSN: 0302-9743
ISBN: 3-540-74998-5
Citations: 1
PageRank: 0.35
References: 7
Authors: 5
Name                         Order  Citations  PageRank
Richard F. E. Sutcliffe      1      318        37.67
Josef Steinberger            2      355        26.95
Udo Kruschwitz               3      387        55.73
Mijail Alexandrov-Kabadjov   4      1          0.35
Massimo Poesio               5      1869       170.68