Title
Automatic extraction of relations between medical concepts in clinical texts.
Abstract
Objective: To present a supervised machine learning approach for discovering relations between medical problems, treatments, and tests mentioned in electronic medical records.
Materials and methods: A single support vector machine classifier was used both to identify relations between concepts and to assign their semantic type. The classifier is informed by several resources, including Wikipedia, WordNet, and General Inquirer, as well as a relation similarity metric.
Results: The techniques reported in this paper were evaluated in the 2010 i2b2 Challenge and obtained the highest F1 score for the relation extraction task, where F1 = 2 * Precision * Recall / (Precision + Recall). When gold standard data for concepts and assertions were available, F1 was 73.7, precision was 72.0, and recall was 75.3. When concepts and assertions were instead discovered automatically, F1 was 48.4, precision was 57.6, and recall was 41.7.
Discussion: Although a rich set of features was developed for the classifiers presented in this paper, little knowledge mining was performed from medical ontologies such as those found in UMLS. Future studies should incorporate features extracted from such knowledge sources, which we expect to further improve the results. Moreover, each relation was discovered independently of the others; joint classification of relations may further improve the quality of results. Joint learning of the discovery of concepts, assertions, and relations may likewise improve the results of automatic relation extraction.
Conclusion: Lexical and contextual features proved to be very important in relation extraction from medical texts. When they are not available to the classifier, the F1 score decreases by 3.7%. Removing the similarity-based features causes a decrease of 1.1%.
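The F1 definition quoted in the abstract can be checked against the reported precision and recall. The following is a minimal Python sketch, not code from the paper; the function name `f1_score` is an assumption chosen for illustration. Because the published precision and recall are rounded to one decimal place, the recomputed value can differ from the reported F1 in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: 2*P*R / (P + R).

    Inputs and output share the same scale (here: percentages).
    Returns 0.0 when both inputs are zero to avoid division by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Gold standard concepts and assertions (values from the abstract).
gold = f1_score(72.0, 75.3)
# Automatically discovered concepts and assertions (values from the abstract).
auto = f1_score(57.6, 41.7)

print(round(gold, 1))  # 73.6 (abstract reports 73.7)
print(round(auto, 1))  # 48.4 (abstract reports 48.4)
```

The small gap in the gold standard setting (73.6 recomputed versus 73.7 reported) is consistent with the inputs having been rounded before publication, although the abstract does not state this.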
Year: 2011
DOI: 10.1136/amiajnl-2011-000153
Venue: Journal of the American Medical Informatics Association
Keywords: support vector machine, relation extraction, feature extraction, gold standard
Field: Data mining, Quality of results, Computer science, Artificial intelligence, Natural language processing, WordNet, Classifier (linguistics), Relationship extraction, Ontology (information science), F1 score, Information retrieval, Unified Medical Language System, Recall
DocType: Journal
Volume: 18
Issue: 5
ISSN: 1067-5027
Citations: 50
PageRank: 1.35
References: 7
Authors: 3
Name
Order
Citations
PageRank
Bryan Rink120514.73
Sanda Harabagiu22203221.65
Kirk Roberts333439.86