Title
Poster: Identification and classification of internal repeats in proteins
Abstract
Internal repeats are widely found in proteins and are considered important in protein evolution and function. Three major types of internal repeat, namely domain, solenoid, and fibrous repeats, are shown in Figure 1. These repeats may be involved in protein-protein interactions as well as in binding to various ligands such as DNA and RNA. For example, tetratrico peptide repeats (TPR) are involved in cell-cycle regulation, transcriptional regulation, protein transport, and assisting protein folding [1][2], and the TATA-binding protein (TBP) is a transcription factor that binds specifically to a DNA sequence [3]. To identify and classify protein repeats of different types and lengths from a query protein sequence or structure, we have designed a comprehensive system that analyzes autocorrelation in sequence content and the topology of secondary structures within a protein. We have also constructed a database of verified fundamental repeat sequence peptides and structural units for homologous matching analysis. The data flow diagram of the proposed identification system is shown in Figure 2. The system contains two major parts: the Repeat Database and the Internal Repeat Analyzer. The Repeat Database is constructed by evaluating proteins from SCOP and Pfam through an autocorrelation mechanism. The Internal Repeat Analyzer performs a three-level hierarchical analysis that detects domain, solenoid, and fibrous repeats, respectively. In addition, an iteratively refined multiple structure alignment tool has been developed for comparing and verifying the extracted internal repeat substructures. In this study, the collected database contains 162 domain families with repeat characteristics, 28 fundamental repeat structure units, and 129 repeat subsequences retrieved from 1,961 superfamilies. We demonstrate that the proposed system can efficiently identify repeat topologies of proteins.
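The abstract does not spell out the autocorrelation mechanism, so the following is only a toy illustration of the general idea, not the authors' method. It scores each lag by the fraction of residues that are identical to the residue that many positions later, and reports the lag with the highest score as a candidate repeat period. The function names `identity_autocorrelation` and `best_repeat_period` are hypothetical, and exact residue identity is an assumption made for brevity; a realistic analysis would likely use substitution scores and secondary-structure information.

```python
def identity_autocorrelation(seq, max_lag=None):
    """Return {lag: fraction of positions i with seq[i] == seq[i + lag]}."""
    n = len(seq)
    if n < 2:
        return {}
    if max_lag is None:
        max_lag = n // 2
    # Clamp so that at least one residue pair is compared at every lag.
    max_lag = min(max_lag, n - 1)
    scores = {}
    for lag in range(1, max_lag + 1):
        pairs = n - lag
        matches = sum(1 for i in range(pairs) if seq[i] == seq[i + lag])
        scores[lag] = matches / pairs
    return scores


def best_repeat_period(seq, max_lag=None):
    """Return the smallest lag with the highest self-match score, or None."""
    scores = identity_autocorrelation(seq, max_lag)
    if not scores:
        return None
    # max() keeps the first maximal key, and lags are inserted in ascending order.
    return max(scores, key=scores.get)


if __name__ == "__main__":
    toy = "ACDEFGH" * 4  # synthetic sequence with a perfect 7-residue repeat
    print(best_repeat_period(toy))
```

On a synthetic sequence built from an exact 7-residue unit, the score peaks at lag 7 and again at its multiples; taking the smallest peak gives the unit length.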
Year
2011
DOI
10.1109/ICCABS.2011.5729920
Venue
ICCABS
Keywords
tetratrico peptide repeat,internal repeat substructure,repeat database,tata-binding protein,repeat topology,fibrous repeat,fundamental repeat sequence peptides,repeat characteristics,fundamental repeat structure unit,internal repeat,correlation,topology,cell cycle regulation,ligand binding,protein folding,databases,biochemistry,database management systems,solenoids,dna,scop,transcription factor,molecular biophysics,proteomics,rna,bioinformatics,protein protein interaction,transcriptional regulation,autocorrelation,repeat unit,proteins,dna sequence,protein transport
Field
Protein folding,Repeat unit,Structural alignment,Protein–protein interaction,Protein sequencing,Biology,Pentapeptide repeat,TATA-binding protein,Direct repeat,Bioinformatics,Genetics
DocType
Conference
Citations
0
PageRank
0.34
References
1
Authors
4
Name | Order | Citations | PageRank
Cing-Han Yang | 1 | 0 | 1.35
Hsin-Wei Wang | 2 | 42 | 6.56
Tsan-Huang Shih | 3 | 2 | 1.82
Tun-Wen Pai | 4 | 127 | 29.71