Title
A Statistical Model For Multilingual Entity Detection And Tracking
Abstract
Entity detection and tracking is a relatively new addition to the repertoire of natural language tasks. In this paper, we present a statistical language-independent framework for identifying and tracking named, nominal and pronominal references to entities within unrestricted text documents, and chaining them into clusters corresponding to each logical entity present in the text. Both the mention detection model and the novel entity tracking model can use arbitrary feature types, being able to integrate a wide array of lexical, syntactic and semantic features. In addition, the mention detection model crucially uses feature streams derived from different named entity classifiers. The proposed framework is evaluated with several experiments run in Arabic, Chinese and English texts; a system based on the approach described here and submitted to the latest Automatic Content Extraction (ACE) evaluation achieved top-tier results in all three evaluation languages.
Year
2004
Venue
HLT-NAACL 2004: Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Main Conference
Keywords
semantics, syntax, extraction, natural language, statistical analysis, tracking, statistical model
Field
Chaining, Arabic, Computer science, Automatic Content Extraction, Named entity, Natural language, Natural language processing, Artificial intelligence, Statistical model, Syntax, Semantics
DocType
Conference
Citations
93
PageRank
12.03
References
16
Authors
8
Name                 Order  Citations  PageRank
Radu Florian         1      924        91.44
Hany Hassan          2      277        26.16
Abraham Ittycheriah  3      534        61.23
Hongyan Jing         4      1524       112.18
Nanda Kambhatla      5      390        51.52
Xiaoqiang Luo        6      711        52.14
Nicolas Nicolov      7      400        76.27
Salim Roukos         8      6248       845.50