Title
Memory-based one-step named-entity recognition: Effects of seed list features, classifier stacking, and unannotated data.
Abstract
We present a memory-based named-entity recognition system that chunks and labels named entities in a one-shot task. Training and testing on CoNLL-2003 shared task data, we measure the effects of three extensions. First, we incorporate features that signal the presence of wordforms in external, language-specific seed (gazetteer) lists. Second, we build a second-stage stacked classifier that corrects first-stage output errors. Third, we add selected instances from classified unannotated data to the training material. The system that incorporates all three extensions attains an overall F-rate on the final test set of 78.20 on English and 63.02 on German.
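As a rough illustration of the first extension, the sketch below pairs seed-list membership features with a nearest-neighbour lookup over stored training instances, which is the general idea behind memory-based classification. It is a minimal sketch under stated assumptions, not the authors' system: the seed lists, the three-word feature window, the overlap distance, and the toy sentences are all hypothetical, and the stacking and unannotated-data extensions are not covered.

```python
from collections import Counter

# Hypothetical toy seed (gazetteer) lists; the lists used in the paper are not reproduced here.
SEED_LISTS = {
    "PER": {"iris", "antal"},
    "LOC": {"germany", "tilburg"},
}

def features(tokens, i):
    """Feature vector for token i: left word, focus word, right word,
    a capitalisation flag, and one membership flag per seed list."""
    left = tokens[i - 1].lower() if i > 0 else "_"
    right = tokens[i + 1].lower() if i + 1 < len(tokens) else "_"
    word = tokens[i]
    seed_flags = tuple(word.lower() in SEED_LISTS[name] for name in sorted(SEED_LISTS))
    return (left, word.lower(), right, word[0].isupper()) + seed_flags

def overlap_distance(a, b):
    """Count the feature positions on which two vectors disagree."""
    return sum(x != y for x, y in zip(a, b))

def classify(memory, vec, k=1):
    """Return the majority label among the k stored instances nearest to vec."""
    nearest = sorted(memory, key=lambda inst: overlap_distance(inst[0], vec))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy training sentence; each label encodes chunk boundary and entity type in one step.
train_tokens = ["Iris", "lives", "in", "Tilburg"]
train_labels = ["B-PER", "O", "O", "B-LOC"]
memory = [(features(train_tokens, i), lab) for i, lab in enumerate(train_labels)]

# Unseen names are still matched to the right class through the seed-list flags.
test_tokens = ["Antal", "works", "in", "Germany"]
predicted = [classify(memory, features(test_tokens, i)) for i in range(len(test_tokens))]
print(list(zip(test_tokens, predicted)))
```

In this toy setup the words "Antal" and "Germany" never occur in the training sentence, so the seed-list flags, together with capitalisation and context, are what bring them close to the stored person and location instances.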
Year
2003
DOI
10.3115/1119176.1119203
Venue
CoNLL
Keywords
final test, language-specific seed, training material, oneshot task, memory-based one-step named-entity recognition, classified unannotated data, conll-2003 shared task data, seed list feature, first-stage output error, overall f-rate, memory-based named-entity recognition system
Field
Recognition system, Computer science, Speech recognition, Artificial intelligence, Natural language processing, Classifier (linguistics), Named-entity recognition, Machine learning, Stacking, German, Test set
DocType
Conference
Citations
5
PageRank
1.97
References
8
Authors
2
Name                  Order  Citations  PageRank
Iris Hendrickx        1      285        30.91
Antal Van Den Bosch   2      1038       132.37