Title
A distributional semantics approach to simultaneous recognition of multiple classes of named entities
Abstract
Named Entity Recognition and Classification has been studied for the last two decades. Because semantic features take a huge amount of training time and are slow at inference, existing tools apply features and rules mainly at the word level or use lexicons. Recent advances in distributional semantics allow us to efficiently create paradigmatic models that encode word order. We used Sahlgren et al.'s permutation-based variant of the Random Indexing model to create a scalable and efficient system that simultaneously recognizes multiple entity classes mentioned in natural language. The system is validated on the GENIA corpus, which has annotations for 46 biomedical entity classes and supports nested entities. Using distributional semantics features only, it achieves an overall micro-averaged F-measure of 67.3% based on fragment matching, with performance ranging from 7.4% for “DNA substructure” to 80.7% for “Bioentity”.
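To illustrate how a permutation-based Random Indexing model can encode word order, here is a minimal sketch. It is not the authors' implementation. It assumes that coordinate rotation serves as the permutation, that index vectors are sparse ternary vectors seeded by the word itself, and that the values of DIM, NONZERO, and WINDOW are small illustrative choices. All function names are hypothetical.

```python
import random
from collections import defaultdict

DIM = 64      # vector dimensionality (assumed; real models use far more)
NONZERO = 4   # number of +1/-1 entries per index vector (assumed)
WINDOW = 2    # neighbours considered on each side of the focus word (assumed)


def index_vector(word, dim=DIM, nonzero=NONZERO):
    """Sparse ternary random vector, seeded by the word so it is reproducible."""
    rng = random.Random(word)
    vec = [0] * dim
    for n, pos in enumerate(rng.sample(range(dim), nonzero)):
        vec[pos] = 1 if n % 2 == 0 else -1
    return vec


def rotate(vec, k):
    """Permute coordinates by rotating k places; negative k rotates the other way."""
    k %= len(vec)
    return vec[-k:] + vec[:-k] if k else list(vec)


def train(sentences, window=WINDOW):
    """Add each neighbour's index vector, rotated by its offset from the focus word."""
    context = defaultdict(lambda: [0] * DIM)
    for tokens in sentences:
        for i, focus in enumerate(tokens):
            for offset in range(-window, window + 1):
                j = i + offset
                if offset == 0 or not 0 <= j < len(tokens):
                    continue
                shifted = rotate(index_vector(tokens[j]), offset)
                context[focus] = [a + b for a, b in zip(context[focus], shifted)]
    return context


def cosine(u, v):
    """Cosine similarity between two context vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) * sum(b * b for b in v)) ** 0.5
    return dot / norm if norm else 0.0


# Two tokens seen in the same positions relative to the same neighbours
# accumulate identical context vectors.
model = train([["the", "IL-2", "gene"], ["the", "IL-4", "gene"]])
same_context = model["IL-2"] == model["IL-4"]
```

Because each neighbour's vector is rotated by its signed offset, a word to the left of the focus word contributes differently from the same word to the right. This is how the resulting context vectors become sensitive to word order rather than to co-occurrence alone.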
Year: 2010
DOI: 10.1007/978-3-642-12116-6_19
Venue: CICLing
Keywords: word level,nested entity,random indexing model,multiple entity class,biomedical entity class,entity recognition,encode word order,genia corpus,distributional semantics,simultaneous recognition,multiple class,dna substructure,classification,multiple,indexation,entity,natural language,semantics,word order,biomedical
DocType: Conference
Volume: 6008
ISSN: 0302-9743
ISBN: 3-642-12115-2
Citations: 7
PageRank: 1.06
References: 25
Authors: 4
Name | Order | Citations | PageRank
Siddhartha Jonnalagadda | 1 | 244 | 18.38
Robert Leaman | 2 | 914 | 39.98
Trevor Cohen | 3 | 579 | 53.11
Graciela Gonzalez | 4 | 624 | 39.60