Title
A supervised framework for resolving coreference in clinical records.
Abstract
Objective: A method for the automatic resolution of coreference between medical concepts in clinical records.
Materials and methods: A multiple pass sieve approach utilizing support vector machines (SVMs) at each pass was used to resolve coreference. Information such as lexical similarity, recency of a concept mention, synonymy based on Wikipedia redirects, and local lexical context were used to inform the method. Results were evaluated using an unweighted average of MUC, CEAF, and B³ coreference evaluation metrics. The datasets used in these research experiments were made available through the 2011 i2b2/VA Shared Task on Coreference.
Results: The method achieved an average F score of 0.821 on the ODIE dataset, with a precision of 0.802 and a recall of 0.845. These results compare favorably to the best-performing system with a reported F score of 0.827 on the dataset and the median system F score of 0.800 among the eight teams that participated in the 2011 i2b2/VA Shared Task on Coreference. On the i2b2 dataset, the method achieved an average F score of 0.906, with a precision of 0.895 and a recall of 0.918 compared to the best F score of 0.915 and the median of 0.859 among the 16 participating teams.
Discussion: Post hoc analysis revealed significant performance degradation on pathology reports. The pathology reports were characterized by complex synonymy and very few patient mentions.
Conclusion: The use of several simple lexical matching methods had the most impact on achieving competitive performance on the task of coreference resolution. Moreover, the ability to detect patients in electronic medical records helped to improve coreference resolution more than other linguistic analysis.
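The evaluation described in the abstract, an unweighted average over the MUC, CEAF, and B³ metrics, can be illustrated with a minimal sketch. It assumes that each metric yields its own precision and recall, that the per-metric F score is the balanced harmonic mean, and that the three F scores are averaged with equal weight. The function names and the per-metric numbers below are hypothetical and are not taken from the paper.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def unweighted_average(values: list[float]) -> float:
    """Arithmetic mean with equal weight for every metric."""
    return sum(values) / len(values)


# Hypothetical (precision, recall) pairs per metric, for illustration only.
per_metric = {
    "MUC": (0.80, 0.90),
    "B3": (0.90, 0.80),
    "CEAF": (0.70, 0.70),
}

# F score for each metric, then the equal-weight average across metrics.
per_metric_f = {name: f_score(p, r) for name, (p, r) in per_metric.items()}
average_f = unweighted_average(list(per_metric_f.values()))

print(per_metric_f)
print(round(average_f, 3))
```

Because the averaging is applied per metric, the averaged F score need not equal the harmonic mean of the averaged precision and recall, which may explain small differences between the two in reported figures.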
Year: 2012
DOI: 10.1136/amiajnl-2012-000810
Venue: JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION
Field: Data mining, F1 score, Lexical similarity, Coreference, Information retrieval, Computer science, Support vector machine, Artificial intelligence, Natural language processing, Recall, Linguistic analysis
DocType: Journal
Volume: 19
Issue: 5
ISSN: 1067-5027
Citations: 3
PageRank: 0.37
References: 35
Authors: 3
Name             Order  Citations  PageRank
Bryan Rink       1      205        14.73
Kirk Roberts     2      334        39.86
Sanda Harabagiu  3      2203       221.65