Abstract | ||
---|---|---|
Health concerns everyone, and integrating different medical data sets can bring tremendous value. Based on Chinese and English medical disease terms, we use text mining techniques to cluster diseases semantically along two dimensions, the disease name and its text description, and thereby achieve an initial alignment of disease terminology. First, we translate the Chinese data set through a translation API. Then we assign a weight to each feature item to obtain a feature vector for each disease node. Finally, we calculate the similarity between diseases and apply K-means clustering. We evaluate the method in experiments on a real-world, authoritative data set, and the results indicate that the method is both reasonable and superior. With further improvement, the method can be extended to the initial alignment of multilingual texts that describe the same concept. |
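
The steps named in the abstract (feature weighting, similarity calculation, K-means clustering) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the weighting scheme, the similarity measure, or the cluster initialisation, so TF-IDF weights, cosine similarity, and farthest-point seeding are assumptions made here. The four disease descriptions are invented toy data standing in for entries that have already been translated into English.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for tokenised docs."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    # Document frequency: in how many documents each token occurs.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption; the abstract names no formula).
        vec = [tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity of two L2-normalised vectors (their dot product)."""
    return sum(a * b for a, b in zip(u, v))


def sq_dist(u, v):
    """Squared Euclidean distance, the quantity K-means minimises."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, iters=20):
    """Lloyd's K-means with deterministic farthest-point seeding."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next seed: the vector farthest from its nearest existing seed.
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = []
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical disease entries (name plus short description), already in English.
entries = [
    "type 2 diabetes mellitus high blood glucose insulin resistance",
    "diabetes mellitus elevated blood glucose insulin",
    "influenza viral infection fever cough",
    "flu influenza virus fever cough sore throat",
]
docs = [entry.split() for entry in entries]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)
print(round(cosine(vectors[0], vectors[1]), 3), cosine(vectors[0], vectors[2]))
```

On this toy input the two diabetes entries share a cluster and the two influenza entries share the other, because entries from different groups have no terms in common. Real data would need proper tokenisation, a principled choice of `k`, and handling of translation noise, none of which the sketch attempts.
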
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/IIKI.2016.26 | 2016 International Conference on Identification, Information and Knowledge in the Internet of Things (IIKI) |
Keywords | Field | DocType |
Term alignment, Disease, Data fusion, Text clustering, Data mining | Ontology (information science), Feature vector, Data set, Rationality, Terminology, Computer science, Computer network, Data pre-processing, Natural language processing, Artificial intelligence, Cluster analysis, Semantics | Conference |
ISBN | Citations | PageRank |
978-1-5090-5953-9 | 0 | 0.34 |
References | Authors | |
2 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yuqi Yang | 1 | 0 | 1.35 |
Guangzhi Zhang | 2 | 2 | 2.05 |
Rongfang Bie | 3 | 547 | 68.23 |
Sungjoong Kim | 4 | 0 | 0.34 |
Dong-Il Shin | 5 | 158 | 34.61 |