Title | ||
---|---|---|
AiProAnnotator: Low-rank Approximation with network side information for high-performance, large-scale human Protein abnormality Annotator |
Abstract | ||
---|---|---|
Annotating genes and proteins is a central problem in biology. We focus on human proteins and medical annotation, both of which are important. The most suitable data for this annotation task is the Human Phenotype Ontology (HPO), whose annotations are sparse but reliable (well curated). Existing approaches to this problem are either feature-based or network-based. The feature-based approach can incorporate a wide variety of information, which makes it better suited to noisy data than to reliable data, while the network-based approach is not necessarily useful for sparse data. Low-rank approximation, in contrast, is powerful for data that are both sparse and reliable. We therefore propose to use matrix factorization to approximate the input annotation matrix (proteins × HPO terms) by a product of two low-rank matrices. We further incorporate network information, i.e., a protein-protein network (PPN) and a network derived from HPO (NHPO), into the matrix factorization framework as graph regularization over the two low-rank matrices. That is, the input annotation matrix is factorized into two low-rank factor matrices that are smooth over PPN and NHPO, respectively. We call the software implementing this method “AiProAnnotator”. In this paper we evaluate it extensively on the latest HPO data under various experimental settings, including performance comparison under cross-validation, computation time, and case studies. The experimental results clearly show the high predictive performance and time efficiency of AiProAnnotator. |
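
The idea of factorizing the annotation matrix under graph regularization can be illustrated with a minimal sketch. The abstract does not give the exact objective or optimizer, so everything below is an assumption made for illustration: a squared-error loss, unnormalized graph Laplacians, plain gradient descent, and the hypothetical names `laplacian`, `objective`, `factorize`, `alpha`, `beta`, and `lam`. It is not the AiProAnnotator implementation. The assumed objective is ‖Y − UVᵀ‖² + α·tr(UᵀL_p U) + β·tr(VᵀL_h V) + λ(‖U‖² + ‖V‖²), where L_p and L_h are the Laplacians of the PPN and NHPO.

```python
import numpy as np


def laplacian(adj):
    """Unnormalized graph Laplacian L = D - A of a symmetric adjacency matrix."""
    return np.diag(adj.sum(axis=1)) - adj


def objective(Y, U, V, L_p, L_h, alpha, beta, lam):
    """Assumed loss: reconstruction error + graph smoothness + ridge penalty."""
    recon = np.sum((Y - U @ V.T) ** 2)
    smooth = alpha * np.trace(U.T @ L_p @ U) + beta * np.trace(V.T @ L_h @ V)
    ridge = lam * (np.sum(U ** 2) + np.sum(V ** 2))
    return recon + smooth + ridge


def factorize(Y, A_p, A_h, rank=2, alpha=0.1, beta=0.1, lam=0.01,
              lr=0.01, n_iter=500, seed=0):
    """Gradient descent on the assumed objective (illustrative only).

    Y   : proteins x HPO-terms annotation matrix
    A_p : protein-protein network adjacency (stands in for PPN)
    A_h : HPO-term network adjacency (stands in for NHPO)
    """
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    U = 0.1 * rng.standard_normal((n, rank))  # protein factors
    V = 0.1 * rng.standard_normal((m, rank))  # HPO-term factors
    L_p, L_h = laplacian(A_p), laplacian(A_h)
    history = []
    for _ in range(n_iter):
        R = U @ V.T - Y  # residual of the current approximation
        # Both gradients are computed before either factor is updated.
        grad_U = 2 * (R @ V + alpha * (L_p @ U) + lam * U)
        grad_V = 2 * (R.T @ U + beta * (L_h @ V) + lam * V)
        U = U - lr * grad_U
        V = V - lr * grad_V
        history.append(objective(Y, U, V, L_p, L_h, alpha, beta, lam))
    return U, V, history


# Toy data (made up): 4 proteins, 3 HPO terms.
Y = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1],
              [0, 1, 1]], dtype=float)
A_p = np.array([[0, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)
A_h = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)

U, V, history = factorize(Y, A_p, A_h)
scores = U @ V.T  # predicted protein x HPO-term association scores
```

In this sketch, `scores` plays the role of the low-rank approximation of the annotation matrix, and larger `alpha` or `beta` pull the factors of neighbouring proteins or neighbouring HPO terms closer together.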
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/BIBM.2018.8621517 | 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) |
Keywords | Field | DocType |
AiProAnnotator,network side information,human proteins,feature-based approach,network-based approach,sparse data,input annotation matrix,high predictive performance,low-rank approximation,human phenotype ontology,large-scale human protein abnormality,genes-proteins annotation,low-rank matrices factorization,HPO data,protein-protein network,PPN,NHPO,graph regularization | Data mining,Annotation,Computer science,Matrix (mathematics),Matrix decomposition,Software,Low-rank approximation,Artificial intelligence,Sparse matrix,Machine learning,Computation,Human Phenotype Ontology | Conference |
ISSN | ISBN | Citations |
2156-1125 | 978-1-5386-5489-7 | 1 |
PageRank | References | Authors |
0.35 | 0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Junning Gao | 1 | 25 | 1.69 |
Shuwei Yao | 2 | 3 | 1.06 |
Hiroshi Mamitsuka | 3 | 973 | 91.71 |
Shanfeng Zhu | 4 | 429 | 35.04 |