Title
DeepText2GO: Improving large-scale protein function prediction with deep semantic text representation
Abstract
UniProtKB had collected more than 88 million protein sequences by July 2017. Fewer than 0.2% of these proteins, however, have experimental GO annotations. To reduce this huge gap, automatic protein function prediction (AFP) is becoming increasingly important. Results on the CAFA (Critical Assessment of protein Function Annotation algorithms) benchmark demonstrate that sequence homology-based methods are highly competitive in AFP. One imperative issue is incorporating information sources other than sequence into AFP. In contrast to the BOW (bag of words) representation used in traditional text-based AFP, we propose a new method called DeepText2GO, which improves large-scale AFP by using deep semantic text representation instead. Furthermore, DeepText2GO integrates both text-based and sequence homology-based methods through a consensus approach. Extensive experiments on a benchmark dataset extracted from UniProt/SwissProt demonstrate that DeepText2GO significantly outperforms both text-based and sequence homology-based methods, validating its superiority.
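The abstract does not say how the consensus step combines the two predictors. As a rough illustration only, the sketch below assumes each component method produces a confidence score per GO term and merges them with a weighted average. The function name `consensus_scores`, the `text_weight` parameter, and all scores are hypothetical and are not taken from the paper.

```python
from typing import Dict


def consensus_scores(
    text_scores: Dict[str, float],
    homology_scores: Dict[str, float],
    text_weight: float = 0.5,
) -> Dict[str, float]:
    """Combine two per-GO-term score dictionaries by weighted average.

    Assumption: a term missing from one source is scored 0.0 by that source.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    terms = set(text_scores) | set(homology_scores)
    return {
        term: text_weight * text_scores.get(term, 0.0)
        + (1.0 - text_weight) * homology_scores.get(term, 0.0)
        for term in terms
    }


# Illustrative scores for a single protein (made-up values).
text_scores = {"GO:0005524": 0.9, "GO:0003677": 0.4}
homology_scores = {"GO:0005524": 0.5, "GO:0006355": 0.8}

combined = consensus_scores(text_scores, homology_scores, text_weight=0.5)

# Rank GO terms from most to least confident.
ranked = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
```

A weighted average is only one possible consensus rule. It is used here because it lets a term supported by both sources outrank a term supported strongly by just one.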
Year
2017
DOI
10.1109/BIBM.2017.8217622
Venue
2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Keywords
DeepText2GO, large-scale protein function prediction, deep semantic text representation, experimental GO annotations, automatic protein function prediction, large-scale AFP, protein sequences, sequence homology
DocType
Conference
ISSN
2156-1125
ISBN
978-1-5090-3051-4
Citations
0
PageRank
0.34
References
0
Authors
2
Name | Order | Citations | PageRank
Ronghui You | 1 | 2 | 1.40
Shanfeng Zhu | 2 | 429 | 35.04