Title
Minimum description length based protein secondary structure prediction
Abstract
This paper introduces a new algorithm for predicting the secondary structure of a protein based on the protein's primary structure, i.e. its amino acid sequence. The problem consists in finding the segmentation of the initial amino acid sequence, where each segment carries the label of a secondary structure, e.g., helix, strand, and coil. Our algorithm is different from other existing probabilistic inference algorithms in that it uses probabilistic models suitable for directly encoding the joint information represented by the pair (amino acid sequence, secondary structure labels), and chooses as winner the secondary structure sequence providing the minimum representation, or description length, in line with the minimum description length principle. An additional benefit of our approach is that we provide not only a secondary structure prediction tool, but also a tool that is able to compress in an efficient manner the joint sequences that define the primary and secondary structure information in proteins. The preliminary results obtained for prediction and compression show a good performance, which is better in certain aspects than that of comparable algorithms.
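The selection rule described in the abstract (pick the label sequence whose joint encoding with the amino acid sequence is shortest) can be illustrated with a small sketch. This is not the model used in the paper: the three-letter toy alphabet, the probabilities in TOY_EMIT, the `stay` parameter, and the function names are all assumptions made for illustration. The sketch charges -log2(p) bits for each label transition and each residue, and uses dynamic programming to find the labeling with the fewest total bits.

```python
import math

LABELS = ("H", "E", "C")  # helix, strand, coil

# Hypothetical P(residue | label) over a toy alphabet; not taken from the paper.
TOY_EMIT = {
    "H": {"A": 0.8, "V": 0.1, "G": 0.1},
    "E": {"A": 0.1, "V": 0.8, "G": 0.1},
    "C": {"A": 0.1, "V": 0.1, "G": 0.8},
}


def code_length(p):
    """Ideal code length in bits for an event of probability p."""
    return -math.log2(p)


def transition_bits(prev, cur, stay):
    """Bits to encode the label `cur` given the previous label `prev`."""
    switch = (1.0 - stay) / (len(LABELS) - 1)
    return code_length(stay if prev == cur else switch)


def description_length(seq, labels, emit, stay=0.9):
    """Total bits to encode the pair (seq, labels) under the toy model."""
    bits = code_length(1.0 / len(LABELS)) + code_length(emit[labels[0]][seq[0]])
    for i in range(1, len(seq)):
        bits += transition_bits(labels[i - 1], labels[i], stay)
        bits += code_length(emit[labels[i]][seq[i]])
    return bits


def mdl_labels(seq, emit, stay=0.9):
    """Return (labels, bits) for the labeling with minimum description length."""
    # cost[l] holds the fewest bits for the prefix seen so far, ending in label l.
    cost = {l: code_length(1.0 / len(LABELS)) + code_length(emit[l][seq[0]])
            for l in LABELS}
    back = []  # back[i][l] is the best previous label when position i+1 is l
    for aa in seq[1:]:
        new_cost, ptr = {}, {}
        for l in LABELS:
            prev = min(LABELS, key=lambda k: cost[k] + transition_bits(k, l, stay))
            new_cost[l] = (cost[prev] + transition_bits(prev, l, stay)
                           + code_length(emit[l][aa]))
            ptr[l] = prev
        back.append(ptr)
        cost = new_cost
    last = min(LABELS, key=cost.get)
    path = [last]
    for ptr in reversed(back):  # trace the winning labels back to the start
        path.append(ptr[path[-1]])
    return "".join(reversed(path)), cost[last]


if __name__ == "__main__":
    labels, bits = mdl_labels("AAAAVVVVGGGG", TOY_EMIT)
    print(labels, round(bits, 2))
```

Because the same bit count serves both as the selection criterion and as the size of the encoded pair, a model of this kind doubles as a compressor for the joint sequence, which is the dual use the abstract points out.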
Year
2008
Venue
EUSIPCO
Keywords
image representation, image segmentation, image sequences, proteins, amino acid sequence, minimum description length, minimum representation, probabilistic inference algorithms, protein primary structure, protein secondary structure prediction, secondary structure labels, secondary structure prediction tool, secondary structure sequence, sequence segmentation
DocType
Conference
ISSN
2219-5491
Citations
0
PageRank
0.34
References
4
Authors
2
Name            Order  Citations  PageRank
Andrea Hategan  1      5          1.57
Ioan Tabus      2      276        38.23