Title
Generic Spaced DNA Motif Discovery Using Genetic Algorithm
Abstract
DNA motif discovery is an important problem in deciphering gene regulation. Motifs usually contain gaps (spaced motifs) and are more complex than contiguously conserved (monad) patterns. Existing algorithms mostly address monad motifs, and methods for spaced motifs impose various constraints on gaps, which may hinder the discovery of complex motifs. In this paper, we propose the Genetic Algorithm (GA) for Spaced Motifs Elicitation on Nucleotides (GASMEN), which searches over a wide range of possible widths (4-25) and substantially relaxes these constraints. GASMEN employs submotif indexing to partition the search space into smaller sub-spaces so that the GA can reach optimality more easily. Multiple-motif control is employed, and probabilistic refinements are proposed to improve motif quality. Preliminary results on real spaced motifs show that GASMEN is promising: it finds more accurate motifs and more nearly optimal widths than the state-of-the-art method, SPACE. GASMEN is also capable of finding monad motifs, outperforming both Weeder and SPACE on most of the 8 real datasets.
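To make the distinction between spaced and monad motifs concrete, the following minimal Python sketch scans a sequence for a two-part spaced motif: a left submotif, a gap of unconstrained bases, and a right submotif, with mismatches counted by Hamming distance. It is an illustration only and not the GASMEN algorithm. The function names, the fixed gap length, the mismatch threshold, and the toy sequence are all assumptions introduced here.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_spaced_motif(seq, left, right, gap, max_mismatches=0):
    """Return start positions where `left`, then `gap` arbitrary bases,
    then `right` occur with at most `max_mismatches` mismatches in total.

    Hypothetical helper for illustration; gap positions are never scored.
    """
    width = len(left) + gap + len(right)
    hits = []
    for i in range(len(seq) - width + 1):
        left_site = seq[i:i + len(left)]
        right_start = i + len(left) + gap
        right_site = seq[right_start:right_start + len(right)]
        if hamming(left_site, left) + hamming(right_site, right) <= max_mismatches:
            hits.append(i)
    return hits


# Toy sequence (made up): ACGT, four gap bases, then GGCA, starting at index 2.
sequence = "TTACGTAAAAGGCATT"
print(find_spaced_motif(sequence, "ACGT", "GGCA", gap=4))
```

Setting gap=0 reduces the same scan to a contiguous (monad) pattern of width len(left) + len(right), which is why a spaced-motif search can also cover monad motifs.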
Year
2010
DOI
10.1109/CEC.2010.5585924
Venue
2010 IEEE Congress on Evolutionary Computation (CEC)
Keywords
hamming distance, genetic algorithm, pulse width modulation, nucleotides, gene regulation, indexation, dna, probabilistic logic, search space, genetic algorithms, genetics
Field
Computer science, Sequence motif, Search engine indexing, Theoretical computer science, Hamming distance, Artificial intelligence, Aerospace electronics, Probabilistic logic, Partition (number theory), Machine learning, Genetic algorithm, Monad (functional programming)
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
4
Name              Order  Citations  PageRank
Tak-ming Chan     1      190        13.57
Kwong-Sak Leung   2      1887       205.58
Kin-Hong Lee      3      257        26.27
Pietro Liò        4      550        99.98