Title
A Genetic-Based EM Motif-Finding Algorithm for Biological Sequence Analysis
Abstract
Motif finding in biological sequence analysis remains a challenge in computational biology, and many algorithms and software packages have been developed to address it. Expectation maximization (EM)-type motif algorithms such as MEME are among the most popular de novo motif discovery methods. However, as pointed out in the literature, EM algorithms depend heavily on their initialization and are easily trapped in local optima. This paper proposes and implements a genetic-based EM motif-finding algorithm (GEMFA) that aims to overcome these drawbacks. GEMFA first initializes a population of multiple local alignments, each encoded on a chromosome that represents a potential solution. It then performs a heuristic search over the whole alignment space, using minimum distance length (MDL), which is generalized from maximum log-likelihood, as the fitness function. The genetic algorithm gradually moves the population towards the best alignment, from which the motif model is derived. Analyses of simulated and real biological sequences showed that GEMFA performed better than simple multiple-restart EM motif finding, especially on subtle motif sequence alignments, and better than other similar algorithms.
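The population-based search described in the abstract can be sketched roughly as follows. This is a minimal illustration and not GEMFA itself: it assumes exactly one motif occurrence per sequence, encodes a chromosome as a list of start positions, scores an alignment with a log-likelihood ratio against a uniform background instead of the paper's MDL fitness, and omits the EM step entirely. The names `fitness` and `evolve`, the toy sequences, and all parameter values are hypothetical.

```python
import math
import random

ALPHABET = "ACGT"


def fitness(seqs, starts, w, pseudo=1.0):
    """Log-likelihood ratio of an alignment against a uniform background.

    Assumption: this stands in for the paper's MDL-based fitness.
    """
    n = len(seqs)
    total = n + pseudo * len(ALPHABET)
    score = 0.0
    for j in range(w):
        # Pseudocount-smoothed letter counts for motif column j.
        counts = {a: pseudo for a in ALPHABET}
        for seq, pos in zip(seqs, starts):
            counts[seq[pos + j]] += 1
        for a in ALPHABET:
            observed = counts[a] - pseudo
            score += observed * math.log((counts[a] / total) / 0.25)
    return score


def evolve(seqs, w, pop_size=30, generations=60, mutation_rate=0.2, seed=0):
    """Toy genetic search over alignments (one start position per sequence)."""
    rng = random.Random(seed)
    limits = [len(s) - w for s in seqs]  # largest valid start per sequence

    # Each chromosome encodes one candidate local alignment.
    population = [[rng.randint(0, m) for m in limits] for _ in range(pop_size)]

    for _ in range(generations):
        # Keep the fitter half unchanged (elitism) and breed the rest.
        population.sort(key=lambda c: fitness(seqs, c, w), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, len(seqs) - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_rate:
                # Mutation: move one sequence's motif start at random.
                i = rng.randrange(len(seqs))
                child[i] = rng.randint(0, limits[i])
            children.append(child)
        population = parents + children

    return max(population, key=lambda c: fitness(seqs, c, w))


# Toy data: the motif TTGACA is planted at starts 0, 2, 4, and 1.
seqs = ["TTGACAGCTA", "CGTTGACAAT", "ACGGTTGACA", "GTTGACACCG"]
planted = [0, 2, 4, 1]
best = evolve(seqs, w=6, seed=1)
print(best, fitness(seqs, best, 6))
```

Because the fitter half of each generation is carried over unchanged, the best fitness in the population never decreases. A fuller treatment would refine each candidate with EM iterations before scoring it, which this sketch leaves out.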
Year
2007
DOI
10.1109/CIBCB.2007.4221233
Venue
CIBCB
Keywords
fitness function,expectation-maximisation algorithm,maximum log-likelihood,em motif discovery,biology computing,expectation maximization-type motif algorithm,genetic-based em motif-finding algorithm,biological sequence analysis,genetic algorithm,minimum distance length,genetic algorithms,data mining,computational biology,sequence alignment,heuristic search,sequences,expectation maximization,sequence analysis,algorithm design and analysis,em algorithm,local alignment
Field
Population,Computer science,Artificial intelligence,Genetic algorithm,Sequence alignment,Heuristic,Algorithm design,Expectation–maximization algorithm,Local optimum,Algorithm,Fitness function,Bioinformatics,Machine learning
DocType
Conference
ISBN
1-4244-0710-9
Citations
11
PageRank
0.58
References
8
Authors
1
Name
Chengpeng Bi
Order
1
Citations
131
PageRank
11.29