deepBioWSD: effective deep neural word sense disambiguation of biomedical text data. - Citegraph

Paper Info

Title
deepBioWSD: effective deep neural word sense disambiguation of biomedical text data.

Abstract
Objective: In biomedicine, there is a wealth of information hidden in unstructured narratives such as research articles and clinical reports. To exploit these data properly, a word sense disambiguation (WSD) algorithm prevents downstream difficulties in the natural language processing applications pipeline. Supervised WSD algorithms largely outperform un- or semisupervised and knowledge-based methods; however, they train a separate classifier for each ambiguous term, which requires a large amount of expert-labeled training data, an unattainable goal in medical informatics. To alleviate this need, a single model that shares statistical strength across all instances and scales well with the vocabulary size is desirable.

Materials and Methods: Built on recent advances in deep learning, our deepBioWSD model uses a single bidirectional long short-term memory network that predicts the sense of any ambiguous term. In the model, the Unified Medical Language System sense embeddings are first computed from their text definitions; then, after the network is initialized with these embeddings, it is trained on all (available) training data collectively. This method also introduces a novel technique for automatically collecting training data from PubMed to (pre)train the network in an unsupervised manner.

Results: We use the MSH WSD dataset to compare WSD algorithms, with macro and micro accuracies employed as evaluation metrics. deepBioWSD outperforms existing models in biomedical text WSD, achieving state-of-the-art performance of 96.82% macro accuracy.

Conclusions: Apart from the disambiguation improvement and unsupervised training, deepBioWSD depends on considerably less expert-labeled data because it learns the target and the context terms jointly. These properties make deepBioWSD convenient to deploy in real-time biomedical applications.
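The abstract reports macro and micro accuracies but does not define them. The sketch below assumes the convention that is usual for per-term WSD benchmarks: micro accuracy is the fraction of all test instances labeled correctly, while macro accuracy is the mean of the per-term accuracies, so every ambiguous term counts equally regardless of how many instances it has. The function name, the record layout, and the toy terms and sense labels are hypothetical and are not taken from the paper or from the MSH WSD dataset.

```python
from collections import defaultdict


def macro_micro_accuracy(records):
    """Return (macro, micro) accuracy for WSD predictions.

    records: iterable of (ambiguous_term, gold_sense, predicted_sense).
    Assumed definitions (not confirmed by the source):
      micro = correct instances / all instances
      macro = mean over terms of that term's own accuracy
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for term, gold, predicted in records:
        total[term] += 1
        if gold == predicted:
            correct[term] += 1
    if not total:
        raise ValueError("no records given")
    micro = sum(correct.values()) / sum(total.values())
    macro = sum(correct[t] / total[t] for t in total) / len(total)
    return macro, micro


# Toy, made-up predictions for two ambiguous terms.
records = [
    ("cold", "sense_A", "sense_A"),
    ("cold", "sense_A", "sense_A"),
    ("cold", "sense_B", "sense_A"),  # wrong
    ("cold", "sense_B", "sense_B"),
    ("ba", "sense_C", "sense_D"),    # wrong
    ("ba", "sense_C", "sense_C"),
]
macro, micro = macro_micro_accuracy(records)
print(f"macro={macro:.3f} micro={micro:.3f}")
```

In this toy data the term with more instances is also the easier one, so the micro figure comes out higher than the macro figure; the two diverge whenever accuracy correlates with how often a term appears in the test set.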

Attribute	Value
Year	2019
DOI	10.1093/jamia/ocy189
Venue	Journal of the American Medical Informatics Association
Keywords	word sense disambiguation, biomedical text mining, deep neural networks, bidirectional long short-term memory network, zero-shot learning
Field	Data mining, Natural language processing, Artificial intelligence, Medicine, Word-sense disambiguation
DocType	Journal
Volume	26
Issue	5
ISSN	1067-5027
Citations	1
PageRank	0.35
References	18
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Ahmad Pesaranghader	1	28	4.20
Stan Matwin	2	3025	344.20
Marina Sokolova	3	720	28.40
Ali Pesaranghader	4	29	3.16
