Title
Vector quantization kernels for the classification of protein sequences and structures.
Abstract
We propose a new kernel-based method for the classification of protein sequences and structures. We first represent each protein as a set of time series using several structural, physicochemical, and predicted properties, such as a sequence of consecutive dihedral angles, hydrophobicity indices, or predictions of disordered regions. A kernel function is then computed for pairs of proteins by exploiting the principles of vector quantization, and is subsequently used with support vector machines for protein classification. Although our method requires a significant pre-processing step, it is fast in the training and prediction stages because kernel computation is linear in the length of the protein sequences. We evaluate our approach on two protein classification tasks: the prediction of SCOP structural classes and the prediction of catalytic activity according to the Gene Ontology. We provide evidence that the method is competitive with string kernels and useful for a range of protein classification tasks. Furthermore, the applicability of our approach extends beyond computational biology to any classification of time series data.
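The abstract does not spell out the kernel itself, so the following is only a minimal sketch of one generic way a vector-quantization kernel over time series could look; it is not the authors' formulation. It assumes a codebook learned with plain Lloyd (k-means) iterations over sliding windows, a normalized histogram of codeword assignments per series, and a dot product between histograms as the kernel. All function names, the window length `w`, the codebook size `k`, and the toy values are hypothetical.

```python
from collections import Counter


def windows(series, w):
    """All length-w sliding windows of a 1-D series, as tuples."""
    return [tuple(series[i:i + w]) for i in range(len(series) - w + 1)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the codeword closest to window v."""
    return min(range(len(codebook)), key=lambda j: sq_dist(codebook[j], v))


def learn_codebook(vectors, k, iters=20):
    """Lloyd (k-means) iterations; seeds are k evenly spaced training windows."""
    assert len(vectors) >= k, "need at least k training windows"
    step = len(vectors) // k
    codebook = [vectors[i * step] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(codebook, v)].append(v)
        # Move each codeword to the mean of its group; keep it if the group is empty.
        codebook = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, codebook)
        ]
    return codebook


def histogram(series, codebook, w):
    """Normalized codeword counts; one pass over the series for a fixed codebook."""
    counts = Counter(nearest(codebook, v) for v in windows(series, w))
    total = sum(counts.values())
    if total == 0:  # series shorter than the window
        return [0.0] * len(codebook)
    return [counts[j] / total for j in range(len(codebook))]


def vq_kernel(x, y, codebook, w):
    """Dot product of the two codeword histograms (assumed kernel form)."""
    hx, hy = histogram(x, codebook, w), histogram(y, codebook, w)
    return sum(p * q for p, q in zip(hx, hy))


# Toy property profiles (made-up values standing in for, e.g., hydrophobicity).
w = 3
a = [0.1, 0.9, 0.2, 0.8, 0.1, 0.9, 0.3, 0.7]
b = [0.5, 0.5, 0.6, 0.4, 0.5, 0.5, 0.6, 0.4]
codebook = learn_codebook(windows(a, w) + windows(b, w), k=4)
k_ab = vq_kernel(a, b, codebook, w)
```

In this sketch the expensive step is learning the codebook, which happens once up front; each kernel evaluation afterwards costs one pass over each series for a fixed codebook. A matrix of such kernel values could then be handed to any SVM implementation that accepts a precomputed kernel.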
Year: 2014
Venue: Biocomputing-Pacific Symposium on Biocomputing
Keywords: Protein classification, protein structure, protein function, kernels, vector quantization, support vector machines
Field: Linde–Buzo–Gray algorithm, Biology, Pattern recognition, Vector quantization, Artificial intelligence
DocType: Conference
ISSN: 2335-6936
Citations: 3
PageRank: 0.39
References: 3
Authors (2)
Name                Order  Citations  PageRank
Wyatt T Clark       1      3          0.39
Predrag Radivojac   2      5          1.52