Title | ||
---|---|---|
Vector quantization kernels for the classification of protein sequences and structures. |
Abstract | ||
---|---|---|
We propose a new kernel-based method for the classification of protein sequences and structures. We first represent each protein as a set of time series using several structural, physicochemical, and predicted properties, such as a sequence of consecutive dihedral angles, hydrophobicity indices, or predictions of disordered regions. A kernel function is then computed for pairs of proteins by exploiting the principles of vector quantization, and is subsequently used with support vector machines for protein classification. Although our method requires a significant pre-processing step, it is fast in the training and prediction stages because the cost of kernel computation is linear in the length of the protein sequences. We evaluate our approach on two protein classification tasks: the prediction of SCOP structural classes and of catalytic activity according to the Gene Ontology. We provide evidence that the method is competitive with string kernels and useful for a range of protein classification tasks. Furthermore, the applicability of our approach extends beyond computational biology to any classification of time series data. |
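
The pipeline outlined in the abstract (time series, then vector quantization, then a kernel) can be illustrated with a minimal sketch. This is only one plausible reading, not the authors' implementation: it assumes a bag-of-codewords construction in which sliding windows of a property series are mapped to their nearest codeword and the kernel is the inner product of normalized codeword histograms. The function names, the window length, the codebook size, the plain Lloyd (k-means) codebook learner, and the random toy series are all assumptions made for illustration.

```python
import numpy as np

def windows(series, w):
    """Return all length-w sliding windows of a 1-D series as rows."""
    series = np.asarray(series, dtype=float)
    return np.stack([series[i:i + w] for i in range(len(series) - w + 1)])

def learn_codebook(X, k, iters=20, seed=0):
    """Plain Lloyd (k-means) iterations; a stand-in for any VQ codebook learner."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every window to every codeword.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old codeword if its cell is empty
                centers[j] = members.mean(axis=0)
    return centers

def vq_histogram(series, centers):
    """Quantize each window to its nearest codeword; return usage frequencies."""
    W = windows(series, centers.shape[1])
    d = ((W[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    counts = np.bincount(d.argmin(axis=1), minlength=len(centers)).astype(float)
    return counts / counts.sum()

def vq_kernel(a, b, centers):
    """Inner product of the two codeword histograms (assumed kernel form)."""
    return float(vq_histogram(a, centers) @ vq_histogram(b, centers))

# Hypothetical toy data: three random "property" series of different lengths.
rng = np.random.default_rng(0)
series_set = [rng.normal(size=n) for n in (30, 45, 60)]

# Pre-processing: learn one shared codebook from all pooled windows.
pooled = np.vstack([windows(s, 5) for s in series_set])
centers = learn_codebook(pooled, k=4)

# Gram matrix over all pairs of series.
gram = np.array([[vq_kernel(a, b, centers) for b in series_set]
                 for a in series_set])
print(gram.round(3))
```

With the codebook size and window length held fixed, quantizing a series costs time proportional to its length, which is consistent with the linear complexity the abstract claims. A Gram matrix like `gram` could then be passed to a support vector machine that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.
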
Year | Venue | Keywords |
---|---|---|
2014 | Biocomputing-Pacific Symposium on Biocomputing | Protein classification, protein structure, protein function, kernels, vector quantization, support vector machines |
Field | DocType | ISSN |
---|---|---|
Linde–Buzo–Gray algorithm, Biology, Pattern recognition, Vector quantization, Artificial intelligence | Conference | 2335-6936 |
Citations | PageRank | References |
---|---|---|
3 | 0.39 | 3 |
Authors | ||
---|---|---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wyatt T Clark | 1 | 3 | 0.39 |
Predrag Radivojac | 2 | 5 | 1.52 |