Title
Maximal Motif Discovery in a Sliding Window.
Abstract
Motifs are relatively short sequences that are biologically significant, and their discovery in molecular sequences is a well-researched subject. A don't care is a special letter that matches every letter in the alphabet. Formally, a motif is a sequence of letters of the alphabet and don't care letters. A motif (m) over tilde (d,k) that occurs at least k times in a sequence is maximal if it cannot be extended (to the left or right) nor can it be specialised (that is, its d' <= d don't cares cannot be replaced with letters from the alphabet) without reducing its number of occurrences. Here we present a new dynamic data structure, and the first on-line algorithm, to discover all maximal motifs in a sliding window of length l on a sequence x of length n in O (ndl + dinverted right perpendicularl/winverted left perpendicular . Sigma(n-1)(i=l)vertical bar DIFFi-1i vertical bar) time, where w is the size of the machine word and DIFFi-1i is the symmetric difference of the sets of occurrences of maximal motifs at x[i - l .. i - 1] and at x[i - l + 1 .. i].
Year
DOI
Venue
2018
10.1007/978-3-030-00479-8_16
Lecture Notes in Computer Science
Keywords
Field
DocType
Motif discovery,Sequence motifs,Genome analysis
Symmetric difference,Combinatorics,Sliding window protocol,Computer science,Dynamic data structures,Alphabet
Conference
Volume
ISSN
Citations 
11147
0302-9743
0
PageRank 
References 
Authors
0.34
8
4
Name
Order
Citations
PageRank
Costas S. Iliopoulos11534167.43
Manal Mohamed210212.62
Solon P. Pissis328157.09
Fatima Vayani4253.85