Title
RPfam: A refiner towards curated-like multiple sequence alignments of the Pfam protein families
Abstract
High-quality multiple sequence alignments (MSAs) can provide insights into the architecture and function of protein families. Existing MSA tools often produce results inconsistent with the biological distribution of conserved regions because they position amino acid residues and gaps by symbols alone. We propose RPfam, a refiner that moves automatic alignments towards curated-like MSAs for modeling the protein families in the Pfam database. RPfam refines automatic alignments by scoring them with the PFASUM matrix, restricting realignment to badly aligned blocks, optimizing block scores by dynamic programming, and iterating refinements with the Simulated Annealing algorithm. Experiments show that RPfam effectively refined the alignments produced by the MSA tools ClustalO and Muscle, with the curated seed alignments of the Pfam protein families as reference. In particular, RPfam improved the quality of the ClustalO alignments by 4.4% and the Muscle alignments by 2.8% on the gp32 DNA binding protein-like family. A Supplementary Table is available at http://www.worldscinet.com/jbcb/.
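Two of the ingredients named in the abstract, scoring an alignment with a substitution matrix and accepting or rejecting a refinement under Simulated Annealing, can be illustrated with a minimal sketch. This is not the RPfam implementation: the scores MATCH, MISMATCH, and GAP are hypothetical stand-ins for PFASUM values, the sum-of-pairs objective is an assumed scoring scheme, and the function names are invented for illustration.

```python
import math
import random

# Hypothetical toy scores standing in for PFASUM entries (assumption, not from the paper).
MATCH, MISMATCH, GAP = 2, -1, -2


def pair_score(a, b):
    """Score one pair of aligned symbols."""
    if a == "-" and b == "-":
        return 0  # gap-gap pairs contribute nothing
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def sum_of_pairs(alignment):
    """Sum pair scores over every column and every pair of rows."""
    total = 0
    for column in zip(*alignment):
        for i in range(len(column)):
            for j in range(i + 1, len(column)):
                total += pair_score(column[i], column[j])
    return total


def accept(old_score, new_score, temperature, rng):
    """Metropolis rule: always keep improvements, occasionally keep worse candidates."""
    if new_score >= old_score:
        return True
    return rng.random() < math.exp((new_score - old_score) / temperature)


# Usage: compare a poorly aligned block with a realigned candidate.
before = ["ACG-", "A-CG"]
after = ["ACG", "ACG"]
rng = random.Random(0)
keep = accept(sum_of_pairs(before), sum_of_pairs(after), temperature=1.0, rng=rng)
```

At a high temperature the acceptance rule tolerates score decreases, which lets the search leave local optima; as the temperature falls it behaves almost like pure hill climbing.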
Year: 2022
DOI: 10.1142/S0219720022400029
Venue: JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY
Keywords: Refinement, MSA, Pfam protein families
DocType: Journal
Volume: 20
Issue: 04
ISSN: 0219-7200
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name            Order   Citations   PageRank
Qingting Wei    1       0           0.68
Hong Zou        2       0           0.34
Cuncong Zhong   3       0           0.68
Jianfeng Xu     4       0           1.35