Abstract |
---|
In this paper we address the problem of identifying first names and last names in a news collection. The approach is based on corpus investigation and is language independent. At the core of the system is a name classifier that relies on the values of different parameters. In its most general form, name category identification is not an easy task. The hardest cases are ambiguous tokens, which can be either a first or a last name, and/or tokens that occur only once. Nevertheless, the system predicts the name category with high accuracy. The experiments were run on an Italian newspaper, and the evaluation was carried out on I-CAB. |
Year | DOI | Venue |
---|---|---|
2008 | 10.1007/978-3-540-78135-6_27 | CICLing |

Keywords | Field | DocType |
---|---|---|
easy task, last name identification, ambiguous token, italian newspaper, different parameter, last name, name classifier, name category identification, corpus investigation, name category, person name | Computer science, Fully qualified name, Newspaper, Natural language processing, Artificial intelligence, Classifier (linguistics), Linguistics | Conference |

Volume | ISSN | ISBN |
---|---|---|
4919 | 0302-9743 | 3-540-78134-X |

Citations | PageRank | References |
---|---|---|
0 | 0.34 | 3 |

Authors |
---|
2 |

Name | Order | Citations | PageRank |
---|---|---|---|
Octavian Popescu | 1 | 78 | 18.05 |
Bernardo Magnini | 2 | 2027 | 226.13 |