Title
Iterative Refinement of Repeat Sequence Specification Using Constrained Pattern Matching
Abstract
Repeated sequences in a genome are structures that indicate important biological functions such as protein binding, and they are associated with various genetic diseases. We consider the problem of finding a specification for a "significant" repeating pattern in a given sequence. A significant pattern carries a high amount of information and has many non-overlapping repeats. For this problem, we propose a method that takes as input an initial specification for a repeating pattern. A pattern is specified by a sequence of letters separated by variable-length wildcards. The method presents to the user maximal occurrences of the current pattern specification such that no text symbol is shared as a letter by two different pattern occurrences. This reduces the begin-end position overlaps among different occurrences. The user then modifies the specification manually to eliminate overlapping repeats. This process continues until a specification for a significant pattern is obtained.
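As a rough illustration of the pattern model described in the abstract, the following Python sketch enumerates occurrences of a pattern given as letters separated by bounded wildcard gaps, then keeps a set of occurrences in which no text position serves as a letter for two occurrences. The names `find_occurrences` and `select_letter_disjoint`, the `(min, max)` gap encoding, and the greedy selection are assumptions made for this example; the keywords suggest the paper itself relies on maximum flow and disjoint paths, which this sketch does not implement.

```python
from typing import List, Tuple

Gap = Tuple[int, int]  # (min, max) number of wildcard symbols between two letters (assumed encoding)


def find_occurrences(text: str, letters: str, gaps: List[Gap]) -> List[Tuple[int, ...]]:
    """Enumerate every occurrence as a tuple of text positions, one per pattern letter."""
    assert len(gaps) == len(letters) - 1
    results: List[Tuple[int, ...]] = []

    def extend(positions: List[int]) -> None:
        k = len(positions)
        if k == len(letters):
            results.append(tuple(positions))
            return
        lo, hi = gaps[k - 1]
        last = positions[-1]
        # The next letter must sit between lo and hi wildcard symbols after the last one.
        for nxt in range(last + 1 + lo, min(last + 1 + hi, len(text) - 1) + 1):
            if text[nxt] == letters[k]:
                extend(positions + [nxt])

    for start, symbol in enumerate(text):
        if symbol == letters[0]:
            extend([start])
    return results


def select_letter_disjoint(occurrences: List[Tuple[int, ...]]) -> List[Tuple[int, ...]]:
    """Greedy heuristic (not the paper's method): prefer occurrences that end earliest
    and skip any occurrence that reuses a text position already taken as a letter."""
    used = set()
    chosen = []
    for occ in sorted(occurrences, key=lambda o: (o[-1], o)):
        if used.isdisjoint(occ):
            chosen.append(occ)
            used.update(occ)
    return chosen


if __name__ == "__main__":
    occs = find_occurrences("AAGG", "AG", [(0, 2)])
    print(occs)
    print(select_letter_disjoint(occs))
```

In this toy input the two selected occurrences still overlap in their begin-end spans but share no letter position, which is the constraint the abstract describes. A greedy pass like this is not guaranteed to return the largest possible set.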
Year
2007
DOI
10.1109/BIBE.2007.4375715
Venue
BIBE
Keywords
begin-end position-overlaps, vertex-disjoints path, pattern matching with wildcards, cellular biophysics, diseases, genetics, repeating pattern, iterative refinement, proteins, maximum flow, biological techniques, nonoverlapping repeats, molecular biophysics, edge disjoint path, genome, repeat sequence specification, repeats, protein binding, constrained pattern matching, genetic diseases, current pattern specification, pattern matching
Field
Iterative refinement, Cellular biophysics, Wildcard character, Computer science, Algorithm, Maximum flow problem, Bioinformatics, Pattern matching
DocType
Conference
ISBN
978-1-4244-1509-0
Citations
2
PageRank
0.38
References
4
Authors
4
Name                 Order   Citations   PageRank
Dan He               1       14          0.88
Abdullah N. Arslan   2       194         19.71
Yu He                3       2           0.38
Xindong Wu           4       8830        503.63