Title
Named Entity Resolution Using Automatically Extracted Semantic Information
Abstract
One major problem in text mining and semantic retrieval is that detected entity mentions have to be assigned to the true underlying entity. The ambiguity of a name results from both polysemy and synonymy: the name of a unique entity may be written in variant ways, and different unique entities may have the same name. The term "bush", for instance, may refer to a woody plant, a mechanical fixing, a nocturnal primate, 52 persons and 8 places covered in Wikipedia, and thousands of other persons. To our knowledge, we are the first to apply a kernel entity resolution approach to the German Wikipedia as a reference for named entities. We describe the context of named entities in Wikipedia and the context of a detected name phrase in a new document by a context vector of relevant features. These are designed from automatically extracted topic indicators generated by an LDA topic model. We use kernel classifiers, e.g. rank classifiers, to determine the right matching entity but also to detect uncovered entities. In comparison to a baseline approach using only text similarity, the approach with added topic features gives a much higher F-value, which is comparable to the results published for English. It turns out that the procedure is also able to detect with high reliability whether a person is not covered by Wikipedia.
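The text-similarity baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words features, the cosine score, the `nil_threshold` value, and the entity ids are all assumptions made for illustration, and a plain similarity threshold stands in for the kernel rank classifier and the LDA topic features described above.

```python
import math
from collections import Counter


def context_vector(text):
    """Bag-of-words term counts for a context (assumed feature choice)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def resolve(mention_context, candidates, nil_threshold=0.2):
    """Return the id of the most similar candidate entity, or None when
    no candidate reaches the (hypothetical) threshold, i.e. the mention
    is treated as an entity not covered by the reference collection."""
    query = context_vector(mention_context)
    best_id, best_score = None, 0.0
    for entity_id, entity_text in candidates.items():
        score = cosine(query, context_vector(entity_text))
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id if best_score >= nil_threshold else None


# Hypothetical reference contexts for two entities named "bush".
candidates = {
    "bush_plant": "woody plant with many branches growing in a garden",
    "bush_politician": "american politician who served as president of the united states",
}

print(resolve("the president met the politician in the united states", candidates))
print(resolve("quarterly earnings rose sharply", candidates))
```

In this toy setup the first mention is assigned to the hypothetical `bush_politician` entry, while the second shares no terms with either candidate and is reported as uncovered (`None`). Topic indicators from a topic model could be appended to the same vectors as additional features.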
Year
2009
Venue
LWA
DocType
Conference
Citations
5
PageRank
0.53
References
14
Authors
2
Name
Order
Citations
PageRank
Anja Pilz1322.83
Gerhard Paass2113683.63