Title
Efficient discovery of structural motifs from protein sequences with combination of flexible intra- and inter-block gap constraints
Abstract
Discovering protein structural signatures directly from primary sequence information is challenging because the residues associated with a functional motif are not necessarily clustered in one region of the sequence. This work proposes an algorithm that aims to discover conserved sequential blocks interleaved by large irregular gaps from a set of unaligned biological sequences. Unlike previous work, which employs only one type of constraint on gap flexibility, we propose combining intra- and inter-block gap constraints to discover longer patterns with larger irregular gaps. The smaller, flexible intra-block gap constraint relaxes the restriction within local motif blocks while keeping them compact, and the larger, flexible inter-block gap constraint allows longer irregular gaps between compact motif blocks. Using two types of gap constraints for different purposes improves the efficiency of the mining process while maintaining high accuracy of the mining results. This efficiency also helps to identify functional motifs that are conserved in only a small subset of the input sequences.
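The abstract does not spell out the mining procedure, so the following is only a minimal sketch of how the two gap constraints could be checked for a single candidate pattern, not the authors' algorithm. The names `occurs`, `support`, `max_intra`, and `max_inter` are hypothetical, the representation of a pattern as a list of residue strings is an assumption, and the toy sequences are invented for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def occurs(seq: str, blocks: Sequence[str], max_intra: int, max_inter: int) -> bool:
    """Return True if `blocks` occur in order in `seq` under both gap limits.

    Assumption: a gap is the number of unmatched residues between two
    consecutive matched residues. Gaps inside a block may be at most
    `max_intra`; gaps between adjacent blocks may be at most `max_inter`.
    """
    # Flatten the pattern into (residue, gap limit before it) pairs.
    # The very first residue has no limit (None): it may start anywhere.
    items: List[Tuple[str, Optional[int]]] = []
    for block in blocks:
        for i, residue in enumerate(block):
            if not items:
                limit = None
            elif i == 0:
                limit = max_inter  # first residue of a later block
            else:
                limit = max_intra  # residue inside a block
            items.append((residue, limit))

    def extend(k: int, prev: int) -> bool:
        # Try to place pattern item k after sequence position `prev`.
        if k == len(items):
            return True
        residue, limit = items[k]
        start = prev + 1
        stop = len(seq) if limit is None else min(len(seq), start + limit + 1)
        for pos in range(start, stop):
            if seq[pos] == residue and extend(k + 1, pos):
                return True
        return False

    return extend(0, -1)


def support(seqs: Sequence[str], blocks: Sequence[str],
            max_intra: int, max_inter: int) -> float:
    """Fraction of sequences in which the pattern occurs."""
    hits = sum(occurs(s, blocks, max_intra, max_inter) for s in seqs)
    return hits / len(seqs)


# Toy, invented sequences: two compact blocks "CC" and "HH" far apart.
toy = ["MCACKLMNPQHGH", "CAACHH"]
print(support(toy, ["CC", "HH"], max_intra=1, max_inter=6))
```

Keeping `max_intra` small bounds the search window inside a block, while the larger `max_inter` is applied only once per block boundary; this is one plausible reading of why separating the two constraints keeps the search tractable.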
Year
2006
DOI
10.1007/11731139_62
Venue
PAKDD
Keywords
inter-block gap constraint,larger irregular gap,compact motif block,irregular gap,gap flexibility,gap constraint,smaller flexible intra-block gap,efficient discovery,protein sequence,larger flexible inter-block gap,large irregular gap,functional motif,structural motif,protein structure
Field
Data mining,Computer science,Motif (music),Knowledge extraction,Structural motif
DocType
Conference
Volume
3918
ISSN
0302-9743
ISBN
3-540-33206-5
Citations
9
PageRank
0.62
References
13
Authors
4
Name, Order, Citations, PageRank
Chen-Ming Hsu, 1, 77, 5.77
Chien-Yu Chen, 2, 367, 29.24
Ching-Chi Hsu, 3, 325, 46.13
Baw-Jhiune Liu, 4, 193, 38.12