Named Entity Resolution Using Automatically Extracted Semantic Information - Citegraph

Paper Info

Title
Named Entity Resolution Using Automatically Extracted Semantic Information

Abstract
One major problem in text mining and semantic retrieval is that detected entity mentions have to be assigned to the true underlying entity. The ambiguity of a name results from both polysemy and synonymy: the name of a unique entity may be written in variant ways, and different unique entities may have the same name. The term "bush", for instance, may refer to a woody plant, a mechanical fixing, a nocturnal primate, 52 persons and 8 places covered in Wikipedia, and thousands of other persons. To our knowledge, we are the first to apply a kernel entity resolution approach to the German Wikipedia as a reference for named entities. We describe the context of named entities in Wikipedia and the context of a detected name phrase in a new document by a context vector of relevant features. These are designed from automatically extracted topic indicators generated by an LDA topic model. We use kernel classifiers, e.g. rank classifiers, to determine the right matching entity but also to detect uncovered entities. Compared with a baseline approach using only text similarity, adding topic features gives a much higher F-value, which is comparable to the results published for English. It turns out that the procedure is also able to detect with high reliability whether a person is not covered by Wikipedia.
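
As a rough illustration of the matching step described in the abstract, the sketch below compares a mention's context vector with the context vectors of candidate entities and reports no match when nothing is similar enough. It is a simplified stand-in, not the authors' method: plain cosine similarity with a fixed threshold replaces the kernel rank classifier, and the function names, candidate names, three-dimensional topic vectors, and threshold value are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def resolve(mention_vec, candidates, threshold=0.5):
    """Return the best-matching candidate name, or None if uncovered.

    mention_vec: topic distribution of the mention's context (assumed input).
    candidates:  dict mapping entity name -> topic distribution (assumed input).
    threshold:   hypothetical cut-off below which the mention is treated
                 as referring to an entity not in the reference set.
    """
    best_name, best_score = None, 0.0
    for name, vec in candidates.items():
        score = cosine(mention_vec, vec)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Hypothetical topic distributions over (politics, botany, other).
candidates = {
    "George W. Bush": [0.8, 0.1, 0.1],
    "Bush (plant)": [0.1, 0.8, 0.1],
}

covered = resolve([0.7, 0.2, 0.1], candidates)
uncovered = resolve([0.1, 0.1, 0.8], candidates)
```

In this toy setup, a mention whose context is dominated by the politics topic resolves to the politician, while a mention dominated by the third topic falls below the threshold and is reported as uncovered.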

Year	Venue	DocType	Citations	PageRank	References	Authors
2009	LWA	Conference	5	0.53	14	2

Authors (2 rows)

Name	Order	Citations	PageRank
Anja Pilz	1	32	2.83
Gerhard Paass	2	1136	83.63
