Title
Protein remote homology detection and fold recognition based on Sequence-Order Frequency Matrix
Abstract
Protein remote homology detection and fold recognition are two critical tasks in the study of protein structure and function. Profile-based methods currently achieve state-of-the-art performance on both tasks. However, widely used sequence profiles, such as the Position-Specific Frequency Matrix (PSFM) and the Position-Specific Scoring Matrix (PSSM), ignore sequence-order effects along the protein sequence. In this study, we propose a novel profile, called the Sequence-Order Frequency Matrix (SOFM), which extracts the sequence-order information of neighboring residues from a Multiple Sequence Alignment (MSA). Combined with two profile feature extraction approaches, Top-n-grams and the Smith-Waterman algorithm, SOFMs are applied to protein remote homology detection and fold recognition, yielding two predictors called SOFM-Top and SOFM-SW. Experimental results show that SOFM contains more information content than other profiles and that both predictors outperform other state-of-the-art methods. It is anticipated that SOFM will become a very useful profile in the study of protein structure and function.
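To make the contrast in the abstract concrete, the following is a minimal sketch, not the paper's method. The function `psfm` computes a standard per-column frequency profile from an MSA, so each column is treated independently of its neighbors. The function `adjacent_pair_frequencies` is a hypothetical illustration of what counting neighboring residues could look like; the abstract does not define SOFM, so the pair-counting scheme, the function names, and the toy alignment are all assumptions.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def psfm(msa):
    """Per-column residue frequencies; gaps and unknown symbols are skipped."""
    matrix = []
    for col in range(len(msa[0])):
        counts = Counter(seq[col] for seq in msa if seq[col] in AMINO_ACIDS)
        total = sum(counts.values())
        matrix.append({aa: n / total for aa, n in counts.items()} if total else {})
    return matrix


def adjacent_pair_frequencies(msa):
    """Hypothetical: frequencies of residue pairs at columns (i, i+1).

    This only illustrates 'sequence-order information of neighboring
    residues'; it is NOT the SOFM definition from the paper.
    """
    matrix = []
    for col in range(len(msa[0]) - 1):
        counts = Counter(
            seq[col:col + 2]
            for seq in msa
            if seq[col] in AMINO_ACIDS and seq[col + 1] in AMINO_ACIDS
        )
        total = sum(counts.values())
        matrix.append({pair: n / total for pair, n in counts.items()} if total else {})
    return matrix


# Toy alignment (assumed data): four aligned sequences of length 3.
toy_msa = ["ACD", "ACE", "GCD", "A-D"]
column_profile = psfm(toy_msa)
pair_profile = adjacent_pair_frequencies(toy_msa)
```

In this toy alignment, `column_profile` records how often each residue occurs in a column but not which residue follows it, whereas `pair_profile` keeps that neighbor relationship for each pair of adjacent columns.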
Year
2019
DOI
10.1109/TCBB.2017.2765331
Venue
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Keywords
Amino acids, Benchmark testing, Data mining, Feature extraction, Hidden Markov models, Kernel, protein fold recognition, protein remote homology detection, Proteins, Sequence-Order Frequency Matrix, Smith-Waterman Local Alignment algorithm, Top-n-gram
DocType
Journal
Volume
16
Issue
1
ISSN
1545-5963
Citations
0
PageRank
0.34
References
0
Authors
4
Name            Order   Citations   PageRank
Bin Liu         1       419         33.30
Junjie Chen     2       76          3.24
Guo M.          3       6           1.10
Xiaolong Wang   4       1208        115.39