Title
Enrichment of oligonucleotide sets with transcription control signals. III: DNA from non-mammalian vertebrates.
Abstract
We studied the frequency distribution of the 1,048,576 possible oligonucleotides 10 bp long (decamers) in a sample of 1.072 × 10^6 bases of genes from non-mammalian vertebrates, made up of 322 sequences extracted from EMBL(R) 29.0, with the aim of detecting transcription control signals. Among all decamers, 2097 (0.2%) had a frequency more than 10 times the mean and were subjected to further statistical analysis. For each of these 2097 decamers (parents), we counted the individual frequencies of the 30 decamers differing from the parent by a single base mutation (progeny), and we calculated two variance/mean chi squares for the progeny, one with and one without the parent decamer. Studying the distribution of the ratio between the two chi squares, we observed that 1017 of the 2097 parents had a chi-square ratio between 1 and 1.5. In this final set, which corresponds to < 0.097% of all possible decamers, 75 decamers were found to contain 100 transcription control elements, such as CCAAT and others. The final set contains a high excess of signals when compared to 100 random sets of 1017 decamers. Some of the decamers selected with this procedure are members of consensus sequences rather than unique sequences.
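The selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the variance/mean chi square is assumed to be the index-of-dispersion statistic sum((x - mean)^2) / mean, and the ratio is assumed to be taken as "with parent" over "without parent", since the abstract does not state the direction.

```python
from collections import Counter

BASES = "ACGT"


def count_kmers(sequences, k=10):
    """Count every overlapping k-mer composed only of A, C, G and T."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set(BASES):
                counts[kmer] += 1
    return counts


def select_parents(counts, k=10, fold=10):
    """Keep k-mers whose count exceeds `fold` times the mean over all 4**k k-mers."""
    mean = sum(counts.values()) / 4 ** k
    return [kmer for kmer, c in counts.items() if c > fold * mean]


def one_mutation_neighbors(kmer):
    """Return the 3*k k-mers differing from `kmer` at exactly one position."""
    return [kmer[:i] + b + kmer[i + 1:]
            for i in range(len(kmer))
            for b in BASES if b != kmer[i]]


def dispersion_chi_square(values):
    """Variance/mean chi square (assumed form): sum((x - mean)^2) / mean."""
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    return sum((x - mean) ** 2 for x in values) / mean


def chi_square_ratio(parent, counts):
    """Ratio of the chi square with the parent to the one without (assumed direction)."""
    progeny = [counts[n] for n in one_mutation_neighbors(parent)]
    without_parent = dispersion_chi_square(progeny)
    with_parent = dispersion_chi_square(progeny + [counts[parent]])
    return with_parent / without_parent if without_parent else float("inf")
```

Under these assumptions, the final set would be the parents returned by `select_parents` whose `chi_square_ratio` lies between 1 and 1.5. For a decamer, `one_mutation_neighbors` yields the 30 progeny mentioned in the abstract (10 positions, 3 alternative bases each).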
Year
1993
DOI
10.1093/bioinformatics/9.6.647
Venue
Computer Applications in the Biosciences
Field
Gene, Transcription (biology), Biology, DNA, Oligonucleotide, Bioinformatics, Genetics, Consensus sequence, Statistical analysis
DocType
Journal
Volume
9
Issue
6
ISSN
0266-7061
Citations
0
PageRank
0.34
References
0
Authors
4
Name                   Order  Citations  PageRank
C Scapoli              1      0          1.69
A Rodríguez-Larralde   2      0          0.68
Stefano Volinia        3      94         18.64
I Barrai               4      5          9.96