Natural Language Information Retrieval: TREC-7 Report - Citegraph

Paper Info

Title
Natural Language Information Retrieval: TREC-7 Report

Abstract
The GE/Rutgers/SICS/Helsinki team performed runs in the main ad-hoc task. All submissions are NLP-assisted retrieval. We used two retrieval engines, SMART and InQuery, built into the stream model architecture, in which each stream represents an alternative text indexing method. The TREC data was processed at Helsinki using the commercial Functional Dependency Grammar (FDG) text processing toolkit. Six linguistic streams were produced, described below. The processed text streams were sent via ftp to Rutgers for indexing, which was done using the InQuery system. Additionally, four streams produced by the GE NLToolset for TREC-6 were reused in SMART indexing. Ad-hoc topics were processed at GE using both automatic and manual topic expansion. We used the interactive Query Expansion Tool to expand topics with automatically generated summaries of the top 30 documents retrieved by the original topic. Manual intervention was restricted to accept/reject decisions on summaries, and we observed a time limit of 10 minutes per topic. Automatic topic expansion was done by replacing human summary selection with an automatic procedure that accepted only the summaries that obtained sufficiently high scores. The two sets of expanded topics (automatic and manual) were sent to Helsinki for NL processing, and then on to Rutgers for retrieval. Rankings were obtained from each stream index and then merged using a combined strategy developed at GE and SICS.
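
The abstract names two mechanical steps without giving their details: accepting only summaries with sufficiently high scores, and merging the rankings obtained from each stream index. The sketch below illustrates both under stated assumptions. The function names, the score threshold, the stream names, and the weighted-sum combination rule are all hypothetical; the abstract does not specify the actual cutoff or the merging strategy developed at GE and SICS.

```python
from collections import defaultdict


def accept_summaries(scored_summaries, threshold):
    """Keep only summaries whose score meets the threshold.

    The threshold value is a hypothetical cutoff, not one from the paper.
    """
    return [text for text, score in scored_summaries if score >= threshold]


def merge_streams(stream_rankings, stream_weights):
    """Merge per-stream rankings by a weighted sum of document scores.

    This is an illustrative combination rule, not the GE/SICS strategy.
    stream_rankings maps a stream name to a list of (doc_id, score) pairs.
    Streams without an explicit weight default to 1.0.
    """
    combined = defaultdict(float)
    for stream, ranking in stream_rankings.items():
        weight = stream_weights.get(stream, 1.0)
        for doc_id, score in ranking:
            combined[doc_id] += weight * score
    # Highest combined score first; ties broken by document id.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Example with two made-up streams and made-up scores.
rankings = {
    "stems": [("d1", 0.9), ("d2", 0.4)],
    "phrases": [("d2", 0.8), ("d3", 0.3)],
}
weights = {"stems": 1.0, "phrases": 2.0}
merged = merge_streams(rankings, weights)
merged_ids = [doc_id for doc_id, _ in merged]
```

A weighted sum is only one of several possible ways to combine streams; it is used here because it makes each stream's contribution to the final ranking easy to see.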

Year	Venue	Keywords
1998	TREC	architecture,flavor,hierarchies,indexes,query expansion,document retrieval,statistics,engines,natural language processing,english language,information processing,information retrieval,mercury,prototypes,search engine,functional dependency,indexing terms,indexation,methodology,data bases,predictions,information retrieval system,inversion,sensitivity,inverted index,first principle,natural language
Field	DocType	Citations
Weighting,Ranking,Information retrieval,Computer science,Search engine indexing,Natural language,Artificial intelligence,Natural language processing,Document retrieval,Text Retrieval Conference,Parsing,Text processing	Conference	55
PageRank	References	Authors
6.39	6	8

Authors (8 rows)


Name	Order	Citations	PageRank
Tomek Strzalkowski	1	886	200.02
Gees C. Stein	2	79	12.00
G. Bowden Wise	3	139	23.33
Jose Perez Carballo	4	193	44.60
Pasi Tapanainen	5	512	101.53
Timo Järvinen	6	323	56.75
Atro Voutilainen	7	314	68.94
Jussi Karlgren	8	831	140.24
