Title
ConDo: Protein domain boundary prediction using coevolutionary information.
Abstract
Motivation: Domain boundary prediction is one of the most important problems in the study of protein structure and function. Many sequence-based domain boundary prediction methods are either template-based or machine learning (ML) based. ML-based methods often perform poorly because they use only local (i.e. short-range) features. These conventional features, such as sequence profiles, secondary structures and solvent accessibilities, are typically restricted to within 20 residues of the domain boundary candidate.
Results: To improve the performance of ML-based methods, we developed a new protein domain boundary prediction method (ConDo) that uses novel long-range features, such as coevolutionary information, in addition to the aforementioned local window features as inputs for ML. For this purpose, two types of coevolutionary information were extracted from multiple sequence alignments using direct coupling analysis: (i) partially aligned sequences, and (ii) correlated mutation information. Both the partially aligned sequence information and the modularity of residue-residue couplings carry long-range correlation information.
Availability and implementation: https://github.com/gicsaw/ConDo.git
Supplementary information: Supplementary data are available at Bioinformatics online.
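The contrast the abstract draws between local window features and long-range coupling information can be illustrated with a minimal Python sketch. This is not the ConDo implementation: the function names `window_features` and `cross_boundary_coupling`, the zero-padding at the termini, and the mean cross-boundary coupling score are all assumptions made for illustration. The only detail taken from the abstract is the window of 20 residues on either side of a boundary candidate.

```python
def window_features(profile, center, half_width=20):
    """Collect per-residue feature rows within half_width of center.

    profile is a list of equal-length feature rows, one per residue.
    Positions past either terminus are zero-padded (an assumption).
    """
    n_feat = len(profile[0])
    rows = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(profile):
            rows.append(list(profile[pos]))
        else:
            rows.append([0.0] * n_feat)
    return rows


def cross_boundary_coupling(couplings, boundary):
    """Mean coupling between residues before the boundary and residues
    at or after it. A low value suggests two weakly coupled modules.

    couplings is a symmetric residue-by-residue matrix (list of lists),
    e.g. scores from a direct coupling analysis. This score is a
    hypothetical stand-in for a modularity-style long-range feature.
    """
    n = len(couplings)
    left = range(0, boundary)
    right = range(boundary, n)
    if not left or not right:
        return 0.0
    total = sum(couplings[i][j] for i in left for j in right)
    return total / (len(left) * len(right))


# Toy coupling matrix: residues 0-2 and 3-5 form two separate blocks.
toy = [[1.0 if (i < 3) == (j < 3) else 0.0 for j in range(6)]
       for i in range(6)]
scores = [cross_boundary_coupling(toy, b) for b in range(1, 6)]
best_boundary = 1 + scores.index(min(scores))
```

In this toy case the cross-boundary score reaches its minimum at position 3, where the two blocks meet. The window features are limited to the neighbourhood of each candidate, whereas the coupling score draws on every residue pair that spans the candidate boundary.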
Year: 2019
DOI: 10.1093/bioinformatics/bty973
Venue: BIOINFORMATICS
Field: Data mining, Protein domain, Computer science
DocType: Journal
Volume: 35
Issue: 14
ISSN: 1367-4803
Citations: 1
PageRank: 0.35
References: 5
Authors: 3
Name              Order   Citations   PageRank
Seung Hwan Hong   1       1           0.35
Keehyoung Joo     2       5           2.81
Jooyoung Lee      3       99          10.25