Title
An Automatic Approach for Constructing a Knowledge Base of Symptoms in Chinese
Abstract
While many well-known knowledge bases (KBs) in life science have been published as Linked Open Data, few are available in Chinese. Such KBs are nevertheless necessary for automatically processing and analyzing electronic medical records (EMRs) written in Chinese. Among them, a symptom KB in Chinese is the most urgently needed, since symptoms are the starting point of clinical diagnosis. Moreover, the expressions used to describe symptoms in clinical practice are diverse, which makes such a KB hard to compile. In this paper, we publish a public KB of symptoms in Chinese. The KB is constructed by fusing data automatically extracted from eight mainstream healthcare websites and three Chinese encyclopedia sites, supplemented with symptoms extracted from a large number of EMRs. The resulting KB contains more than 26,000 distinct symptoms in Chinese, including 3,968 symptoms in traditional Chinese medicine (TCM), and 1,029 synonym pairs for symptoms. The KB also includes related concepts such as diseases and medicines, as well as relations between symptoms and these entities. We also link our KB to the Unified Medical Language System (UMLS) and analyze the differences between symptoms in the two KBs. We release the KB as Linked Open Data, together with a demo, at https://datahub.io/dataset/symptoms-in-chinese.
Year: 2017
DOI: 10.1186/s13326-017-0145-x
Venue: J. Biomedical Semantics
Keywords: Information extraction, Knowledge base, Linked data, Symptoms in Chinese
DocType: Journal
Volume: 8-S
Issue: 1
ISSN: 2041-1480
Citations: 5
PageRank: 0.55
References: 20
Authors: 7
Name | Order | Citations | PageRank
Tong Ruan | 1 | 49 | 14.79
Mengjie Wang | 2 | 5 | 0.55
Jian Sun | 3 | 5 | 0.55
Ting Wang | 4 | 725 | 120.28
Lu Zeng | 5 | 6 | 1.94
Yichao Yin | 6 | 6 | 0.90
Ju Gao | 7 | 6 | 3.26