Abstract | ||
---|---|---|
Identifying relatedness among diseases could deepen understanding of the pathogenic mechanisms underlying diseases and facilitate drug repositioning projects. A number of methods for computing disease similarity have been developed; however, none of them was designed to use the entire protein interaction network, relying instead only on interactions involving disease-causing genes. Most previously published methods require gene-disease association data; unfortunately, many diseases still have very few or no associated genes, which has impeded broad adoption of those methods. In this study, we propose a new method (MedNetSim) for computing disease similarity by integrating the medical literature with the protein interaction network. MedNetSim consists of a network-based method (NetSim), which employs the entire protein interaction network, and a MEDLINE-based method (MedSim), which computes disease similarity by mining the biomedical literature. Among function-based methods, NetSim achieved the best performance: its average AUC (area under the receiver operating characteristic curve) reached 95.2%. MedSim, whose performance was comparable even to some function-based methods, achieved the highest average AUC among semantic-based methods. Integrating MedSim and NetSim (MedNetSim) further improved the average AUC to 96.4%. We also studied the effectiveness of different data sources. For protein interaction data, quality was more important than volume. In contrast, for gene-disease association data, a higher volume was more beneficial, even at lower reliability. Using a higher volume of disease-related gene data further improved the average AUC of MedNetSim and NetSim to 97.5% and 96.7%, respectively. Integrating the biomedical literature with the protein interaction network can be an effective way to compute disease similarity. When disease-related gene data are insufficient, literature-based methods such as MedSim can be a valuable addition to function-based algorithms. It may be beneficial to steer more resources toward studying gene-disease associations and improving the quality of protein interaction data. Disease similarities can be computed using the proposed methods at http://www.digintelli.com:8000/ . |
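
The abstract reports performance as AUC, the area under the receiver operating characteristic curve. As a minimal sketch of how such a figure can be computed, the block below uses the standard rank-based formulation: the AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `auc`, the scores, and the labels are hypothetical illustrations and are not taken from the paper; the paper's own evaluation protocol and benchmark data are not described in this record.

```python
def auc(scores, labels):
    """Rank-based AUC for binary labels (1 = positive, 0 = negative).

    Counts, over all positive/negative pairs, how often the positive
    example has the higher score; ties contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical similarity scores for six disease pairs (assumed values),
# where label 1 marks a pair treated as truly similar in a benchmark.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
example_auc = auc(scores, labels)
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a sort-based variant would be preferable for large benchmark sets.
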
Year | DOI | Venue |
---|---|---|
2016 | 10.1186/s12859-016-1205-4 | BMC Bioinformatics |
Keywords | Field | DocType |
Disease similarity,MedNetSim,MedSim,NetSim,Random walk with Restart | Drug repositioning,Disease,Protein Interaction Map,Similarity computation,Computer science,Interaction network,Network data,Bioinformatics,MEDLINE | Journal |
Volume | Issue | ISSN |
17 | 1 | 1471-2105 |
Citations | PageRank | References |
0 | 0.34 | 14 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ping Li | 1 | 0 | 0.34 |
Yaling Nie | 2 | 0 | 0.34 |
Jingkai Yu | 3 | 70 | 4.20 |