Title | ||
---|---|---|
DeepText2Go: Improving large-scale protein function prediction with deep semantic text representation |
Abstract | ||
---|---|---|
UniProtKB had collected more than 88 million protein sequences by July 2017. Fewer than 0.2% of these proteins, however, have experimental GO annotations. To reduce this huge gap, automatic protein function prediction (AFP) is becoming increasingly important. Results on the CAFA (Critical Assessment of protein Function Annotation algorithms) benchmark demonstrate that sequence homology-based methods are highly competitive in AFP. One imperative issue is incorporating information sources other than sequence into AFP. In contrast to the BOW (bag of words) representation used in traditional text-based AFP, we propose a new method called DeepText2GO that improves large-scale AFP by using a deep semantic text representation instead. Furthermore, DeepText2GO integrates both text-based and sequence homology-based methods through a consensus approach. Extensive experiments on a benchmark dataset extracted from UniProt/SwissProt demonstrate that DeepText2GO significantly outperforms both text-based and sequence homology-based methods. |
Year | DOI | Venue |
---|---|---|
2017 | 10.1109/BIBM.2017.8217622 | 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) |
Keywords | DocType | ISSN |
DeepText2Go,large-scale protein function prediction,deep semantic text representation,experimental GO annotations,automatic protein function prediction,large-scale AFP,protein sequences,sequence homology | Conference | 2156-1125 |
ISBN | Citations | PageRank |
978-1-5090-3051-4 | 0 | 0.34 |
References | Authors | |
0 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ronghui You | 1 | 2 | 1.40 |
Shanfeng Zhu | 2 | 429 | 35.04 |