Abstract | ||
---|---|---|
Motivation: Motif identification for sequences has many important applications in biological studies, e.g., diagnostic probe design,
locating binding sites and regulatory signals, and potential drug target identification. There are two versions.
1. Single Group: Given a group of n sequences, find a length-l motif that appears in each of the given sequences such that the occurrences of the motif are similar.
2. Two Groups: Given two groups of sequences B and G, find a length-l (distinguishing) motif that appears in every sequence in B and does not appear anywhere in the sequences in G.
Here, the occurrences of the motif in the given sequences contain errors. Currently, most existing programs can handle
only the single-group case. Moreover, it is very difficult to use edit distance (allowing indels and replacements) for motif
detection.
Results: (1) We propose a randomized algorithm for the single-group problem that can handle indels in the occurrences of the motif. (2)
We give an algorithm for the two-groups problem. (3) Extensive simulations have been done to evaluate the algorithms.
|
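
The two-groups definition in the abstract can be made concrete with a small checker. The sketch below is not the authors' algorithm. It only tests whether a given candidate motif is distinguishing, under an assumed reading of "appears": some substring of the sequence lies within edit distance `d_b` of the motif. Likewise, "does not appear" is read as every substring lying at edit distance at least `d_g`. The thresholds `d_b` and `d_g` and both function names are hypothetical and do not come from the abstract.

```python
from typing import Sequence


def min_substring_edit_distance(motif: str, seq: str) -> int:
    """Smallest edit distance between `motif` and any substring of `seq`.

    Standard approximate-matching dynamic program: the empty motif prefix
    costs 0 at every position, so an occurrence may start anywhere in `seq`.
    """
    # prev[i] = cost of aligning motif[:i] against the empty text prefix
    prev = list(range(len(motif) + 1))
    best = prev[-1]
    for ch in seq:
        cur = [0]  # an occurrence may start at this position for free
        for i, m in enumerate(motif, start=1):
            cur.append(min(
                prev[i - 1] + (m != ch),  # match or replacement
                prev[i] + 1,              # extra character in seq
                cur[i - 1] + 1,           # motif character missing from seq
            ))
        best = min(best, cur[-1])
        prev = cur
    return best


def is_distinguishing(motif: str, B: Sequence[str], G: Sequence[str],
                      d_b: int, d_g: int) -> bool:
    """Assumed criterion: close to every sequence in B, far from all of G."""
    in_every_b = all(min_substring_edit_distance(motif, s) <= d_b for s in B)
    in_no_g = all(min_substring_edit_distance(motif, s) >= d_g for s in G)
    return in_every_b and in_no_g
```

For example, `is_distinguishing("ACGT", ["TTACGTTT", "GGACTTGG"], ["CCCCCCCC"], d_b=1, d_g=3)` accepts the motif, because each sequence in B contains a substring within one edit of `ACGT` while the sequence in G does not come closer than three edits. Searching for such a motif, rather than verifying one, is the hard part that the paper addresses.
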
Year | DOI | Venue |
---|---|---|
2007 | 10.1007/978-3-540-73437-6_26 | Combinatorial Pattern Matching |
Keywords | Field | DocType |
binding site,motif identification,distinguishing motif,em algorithms,potential drug target identification,single group,length-l motif,two groups,motif detection,randomized algorithm,sequences b,groups problem,group problem,drug targeting,edit distance | Randomized algorithm,Computer science,Motif (music),Drug target,DNA sequencing,Bioinformatics,Indel | Conference |
Volume | ISSN | ISBN |
4580 | 0302-9743 | 3-540-73436-8 |
Citations | PageRank | References |
2 | 0.37 | 18 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wangsen Feng | 1 | 17 | 3.29 |
Zhanyong Wang | 2 | 50 | 7.04 |
Lusheng Wang | 3 | 2433 | 224.97 |