Title
Relation extraction and the influence of automatic named-entity recognition
Abstract
We present an approach for extracting relations between named entities from natural language documents. The approach is based solely on shallow linguistic processing: tokenization, sentence splitting, part-of-speech tagging, and lemmatization. It uses a combination of kernel functions to integrate two different information sources: (i) the whole sentence where the relation appears, and (ii) the local contexts around the interacting entities. We present the results of experiments on extracting five different types of relations from a dataset of newswire documents and show that each information source makes a useful contribution to the recognition task. The combined kernel usually yields significantly higher precision than the basic kernels, sometimes at the cost of slightly lower recall. We also performed a set of experiments to assess how the accuracy of named-entity recognition influences the performance of the relation-extraction algorithm. These experiments used both the correct named entities (i.e., those manually annotated in the corpus) and noisy named entities (i.e., those produced by a machine learning-based named-entity recognizer). The results show that our approach improves significantly on the previous results obtained on the same dataset.
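To make the idea of combining a whole-sentence kernel with a local-context kernel concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the kernels, so the bag-of-words features, the window size, the position-tagged local features, and the choice of an unweighted sum of normalized kernels are all assumptions made for illustration, as are the function names.

```python
import math
from collections import Counter


def count_kernel(features_a, features_b):
    """Linear kernel: dot product of feature-count vectors."""
    counts_a, counts_b = Counter(features_a), Counter(features_b)
    return float(sum(counts_a[f] * counts_b[f] for f in counts_a))


def normalized_kernel(features_a, features_b):
    """Cosine-normalized kernel, so that k(x, x) == 1 for non-empty x."""
    denom = math.sqrt(count_kernel(features_a, features_a)
                      * count_kernel(features_b, features_b))
    return count_kernel(features_a, features_b) / denom if denom > 0 else 0.0


def local_features(tokens, entity_index, tag, window=2):
    """Tokens in a window around one entity, tagged with role and offset.

    The window size and the tagging scheme are illustrative assumptions.
    """
    features = []
    for offset in range(-window, window + 1):
        position = entity_index + offset
        if 0 <= position < len(tokens):
            features.append(f"{tag}:{offset}:{tokens[position]}")
    return features


def combined_kernel(example_a, example_b, window=2):
    """Sum of a whole-sentence kernel and a local-context kernel.

    Each example is (tokens, index_of_entity_1, index_of_entity_2).
    A sum of valid kernels is itself a valid kernel.
    """
    tokens_a, a1, a2 = example_a
    tokens_b, b1, b2 = example_b
    k_global = normalized_kernel(tokens_a, tokens_b)
    local_a = (local_features(tokens_a, a1, "e1", window)
               + local_features(tokens_a, a2, "e2", window))
    local_b = (local_features(tokens_b, b1, "e1", window)
               + local_features(tokens_b, b2, "e2", window))
    k_local = normalized_kernel(local_a, local_b)
    return k_global + k_local


# Hypothetical examples: entity 1 at token 0, entity 2 at token 3.
example_1 = ("Smith works for Acme in Boston".split(), 0, 3)
example_2 = ("Jones works for Initech in Austin".split(), 0, 3)
similarity = combined_kernel(example_1, example_2)
```

A kernel value computed this way could be placed in a Gram matrix and passed to any kernel-based classifier that accepts precomputed kernels; which classifier the paper uses is not stated in the abstract.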
Year
2007
DOI
10.1145/1322391.1322393
Venue
TSLP
Keywords
sentence splitting, information source, basic kernel, named-entity recognition, kernel methods, named-entity recognizer, combined kernel, kernel function, different type, different information source, relation extraction, automatic named-entity recognition, recognition task, information extraction, kernel method, natural language, machine learning
Field
Tokenization (data security), Pattern recognition, Deep linguistic processing, Computer science, Tree kernel, Information extraction, Artificial intelligence, Natural language processing, Kernel method, Named-entity recognition, Sentence, Relationship extraction
DocType
Journal
Volume
5
Issue
1
ISSN
1550-4875
Citations
10
PageRank
0.58
References
24
Authors
3
Name | Order | Citations | PageRank
Claudio Giuliano | 1 | 488 | 33.00
Alberto Lavelli | 2 | 615 | 55.37
Lorenza Romano | 3 | 406 | 22.15