Title
Parameterized BLOSUM matrices for protein alignment
Abstract
Protein alignment is a basic step in much molecular biology research. The BLOSUM matrices, especially BLOSUM62, are the de facto standard matrices for protein alignment. However, after the matrices had been widely used for 15 years, programming errors were found in the initial version of the source code that generates them. Surprisingly, after the bugs were corrected, the "intended" BLOSUM62 matrix performs consistently worse than the "miscalculated" one. In this paper, we find linear relationships among the eigenvalues of the matrices and propose an algorithm to find optimal unified eigenvectors. With them, we can parameterize the matrix BLOSUMx for any given x, which can vary continuously. We compare the effectiveness of our parameterized isentropic matrix with that of BLOSUM62. Furthermore, we propose an iterative alignment and matrix selection process that adaptively finds the best parameter and globally aligns two sequences. Experiments on aligning 13,667 families from the Pfam database and on clustering MHC II protein sequences show improved accuracy, demonstrating the effectiveness of the proposed method.
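The abstract does not give the formulas, so the following is only a minimal sketch of one way to read the parameterization idea: if every matrix in the family shares one set of orthonormal eigenvectors and each eigenvalue varies linearly with x, then a matrix can be rebuilt for any real x. The 2-letter alphabet, the eigenvectors, the coefficients, and the function names are all hypothetical and chosen for illustration; none of them come from the paper or from real BLOSUM data.

```python
import math

# Hypothetical toy setup on a 2-letter alphabet (real BLOSUM matrices are 20x20).
# Shared ("unified") orthonormal eigenvectors, assumed identical for every x.
S = 1.0 / math.sqrt(2.0)
EIGENVECTORS = [
    [S, S],
    [S, -S],
]

# Assumed linear model per eigenvalue: lambda_i(x) = slope_i * x + intercept_i.
# The coefficients are made up for illustration, not fitted to any data.
LINEAR_COEFFS = [
    (0.01, 1.0),
    (-0.02, 5.0),
]


def eigenvalues_at(x):
    """Evaluate the assumed linear eigenvalue model at parameter x."""
    return [slope * x + intercept for slope, intercept in LINEAR_COEFFS]


def parameterized_matrix(x):
    """Rebuild M(x) = sum_i lambda_i(x) * v_i * v_i^T from the shared eigenvectors."""
    n = len(EIGENVECTORS[0])
    matrix = [[0.0] * n for _ in range(n)]
    for lam, vec in zip(eigenvalues_at(x), EIGENVECTORS):
        for r in range(n):
            for c in range(n):
                matrix[r][c] += lam * vec[r] * vec[c]
    return matrix


if __name__ == "__main__":
    # x may take any real value, not only the integers used by published matrices.
    for x in (45.0, 62.0, 62.5):
        print(x, parameterized_matrix(x))
```

Because the eigenvectors are orthonormal, the result is symmetric for every x, and its trace equals the sum of the modeled eigenvalues. How the paper fits the linear relationships and chooses the unified eigenvectors is not specified in the abstract and is not reproduced here.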
Year
2015
DOI
10.1109/TCBB.2014.2366126
Venue
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Keywords
Matrices, Eigenvalues and eigenfunctions, Databases, Optimization, Entropy
Field
Sequence alignment, De facto standard, Parameterized complexity, Computer science, Matrix (mathematics), Theoretical computer science, BLOSUM, Bioinformatics, Cluster analysis, Substitution matrix, Eigenvalues and eigenvectors
DocType
Journal
Volume
12
Issue
3
ISSN
1545-5963
Citations
4
PageRank
0.47
References
16
Authors
8
Name             Order  Citations  PageRank
Dandan Song      1      32         7.92
Jiaxing Chen     2      4          1.15
Guang Chen       3      4          0.47
Ning Li          4      4          0.47
Jin Li           5      4          1.15
Jun Fan          6      43         4.90
Dongbo Bu        7      157        21.54
Shuai Cheng Li   8      184        30.25