Title
Deterministic local alignment methods improved by a simple genetic algorithm
Abstract
Multiple sequence local alignment, often deployed for de novo discovery of biological motifs hidden in a set of DNA or protein sequences, remains a challenge in bioinformatics and computational biology. Many algorithms and software packages have been developed to address the problem. Expectation maximization (EM), one of the popular local alignment methods, is often used to solve the motif-finding problem. However, EM largely depends on its initialization and can be easily trapped in local optima. This paper presents the Genetic-enabled EM Motif-Finding Algorithm (GEMFA) in an effort to mitigate the difficulties confronted the EM-based motif discovery algorithms. The new algorithm integrates a simple genetic algorithm (GA) with a local searcher to explore the local alignment space, that is, it combines deterministic local alignment methods with a simple GA to effectively perform de novo motif discovery. It first initializes a population of multiple local alignments each of which is encoded on a chromosome that represents a potential solution. GEMFA then performs heuristic search in the whole alignment space using minimum distance length (MDL) as the fitness function, which is generalized from maximum log-likelihood. The genetic algorithm gradually moves this population towards the best alignment from which the motif model is derived. Simulated and real biological sequence analysis showed that GEMFA significantly improved deterministic local alignment methods especially in the subtle motif sequence alignment, and it also outperformed other algorithms tested.
Year
DOI
Venue
2010
10.1016/j.neucom.2010.01.023
Neurocomputing
Keywords
Field
DocType
best alignment,local searcher,local optimum,simple genetic algorithm,deterministic local alignment method,motif discovery,subtle motif sequence alignment,local alignment space,multiple local alignment,whole alignment space,expectation maximization (em),multiple sequence local alignment,memetic algorithms,local alignment,genetic algorithms (ga),popular local alignment method,expectation maximization,fitness function,heuristic search,computational biology,protein sequence,genetics,genetic algorithm,sequence alignment,memetic algorithm,sequence analysis
Memetic algorithm,Population,Alignment-free sequence analysis,Pattern recognition,Local optimum,Fitness function,Artificial intelligence,Smith–Waterman algorithm,Multiple sequence alignment,Mathematics,Machine learning,Genetic algorithm
Journal
Volume
Issue
ISSN
73
13-15
Neurocomputing
Citations 
PageRank 
References 
4
0.43
57
Authors
1
Name
Order
Citations
PageRank
Chengpeng Bi113111.29