Title
Evaluating semantic similarity and relatedness over the semantic grouping of clinical term pairs.
Abstract
This article explores how measures of semantic similarity and relatedness are affected by the semantic groups to which the concepts being compared belong. Our goal is to determine whether there are distinctions between homogeneous comparisons (where both concepts belong to the same group) and heterogeneous ones (where the concepts are in different groups). Our hypothesis is that similarity measures will be significantly affected since they rely on hierarchical is-a relations, whereas relatedness measures should be less affected since they use a wider range of relations. We also evaluate the effect of combining different measures of similarity and relatedness. Our hypothesis is that these combined measures will correlate more closely with human judgment, since they better reflect the rich variety of information humans use when assessing similarity and relatedness. We evaluate our method on four reference standards. Three of the reference standards were annotated by human judges for relatedness and one was annotated for similarity. We found significant differences in the correlation of semantic similarity and relatedness measures with human judgment, depending on which semantic groups were involved. We also found that combining a definition-based relatedness measure with an information content similarity measure resulted in significant improvements in correlation over individual measures. The semantic similarity and relatedness package is an open source program available from http://umls-similarity.sourceforge.net/. The reference standards are available at http://www.people.vcu.edu/~btmcinnes/downloads.html.
Year
2015
DOI
10.1016/j.jbi.2014.11.014
Venue
Journal of Biomedical Informatics
Keywords
natural language processing, nlp, semantic relatedness, semantic similarity
Field
Semantic similarity, Data mining, Similarity measure, Information retrieval, Computer science, Homogeneous, Human judgment, Correlation, Artificial intelligence, Natural language processing, Reference standards, Unified Medical Language System
DocType
Journal
Volume
54
Issue
C
ISSN
1532-0464
Citations
6
PageRank
0.45
References
20
Authors
2
Name
Order
Citations
PageRank
Bridget T. McInnes128023.66
Ted Pedersen22738220.47