Title
Searching for ncRNAs in eukaryotic genomes: Maximizing biological input with RNAmotif
Abstract
Non-coding RNAs (ncRNAs) contain both characteristic secondary structure and short sequence motifs. However, "complex" ncRNAs (RNAs bound to proteins in ribonucleoprotein complexes) can be hard to identify in genomic sequence data. Programs able to search for ncRNAs were previously limited to ncRNA molecules that either align very well or have highly conserved secondary structure. The RNAmotif program uses additional information to find ncRNA gene candidates through the design of an appropriate "descriptor" that models sequence motifs, secondary structure and protein/RNA binding information. This enables searches for ncRNAs that contain variable secondary structure and limited sequence motif information. By applying the biologically based concept of "positive and negative controls" to the RNAmotif search technique, we can now go beyond the testing phase to successfully search real genomes, complete with their background noise and related molecules. Descriptors are designed for two "complex" ncRNAs, the U5 snRNA (from the spliceosome) and RNase P RNA, and these descriptors successfully uncover the corresponding sequences in some eukaryotic genomes. We include explanations of how to construct the input "descriptors" from known biological information, to allow searches for other ncRNAs. RNAmotif maximizes the input of biological knowledge into a search for an ncRNA gene and now allows the investigation of some of the hardest-to-find, yet important, genes in some very interesting eukaryotic organisms.
Year: 2004
DOI: 10.2390/biecoll-jib-2004-6
Venue: J. Integrative Bioinformatics
Keywords: negative control, secondary structure, sequence motif, genome sequence, non coding rna
Field: Genome, RNA, Gene, Computer science, Sequence motif, Data sequences, Bioinformatics, Spliceosome, Non-coding RNA, Ribonucleoprotein
DocType: Journal
Volume: 1
Issue: 1
Citations: 3
PageRank: 0.51
References: 4
Authors: 4
Name
Order
Citations
PageRank
Lesley J. Collins130.51
Thomas J. Macke22012.00
David Penny37114.77
Allan C. Wilson430.85