Abstract | ||
---|---|---|
Motifs are relatively short sequences that are biologically significant, and their discovery in molecular sequences is a well-researched subject. A don't care is a special letter that matches every letter in the alphabet. Formally, a motif is a sequence of letters of the alphabet and don't care letters. A motif (m) over tilde (d,k) that occurs at least k times in a sequence is maximal if it cannot be extended (to the left or right) nor can it be specialised (that is, its d' <= d don't cares cannot be replaced with letters from the alphabet) without reducing its number of occurrences. Here we present a new dynamic data structure, and the first on-line algorithm, to discover all maximal motifs in a sliding window of length l on a sequence x of length n in O (ndl + dinverted right perpendicularl/winverted left perpendicular . Sigma(n-1)(i=l)vertical bar DIFFi-1i vertical bar) time, where w is the size of the machine word and DIFFi-1i is the symmetric difference of the sets of occurrences of maximal motifs at x[i - l .. i - 1] and at x[i - l + 1 .. i]. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1007/978-3-030-00479-8_16 | Lecture Notes in Computer Science |
Keywords | Field | DocType |
Motif discovery,Sequence motifs,Genome analysis | Symmetric difference,Combinatorics,Sliding window protocol,Computer science,Dynamic data structures,Alphabet | Conference |
Volume | ISSN | Citations |
11147 | 0302-9743 | 0 |
PageRank | References | Authors |
0.34 | 8 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Costas S. Iliopoulos | 1 | 1534 | 167.43 |
Manal Mohamed | 2 | 102 | 12.62 |
Solon P. Pissis | 3 | 281 | 57.09 |
Fatima Vayani | 4 | 25 | 3.85 |