Abstract |
---|
Accurate splice site prediction is a critical component of any computational approach to gene prediction in higher organisms. Existing approaches generally use sequence-based models that capture local dependencies among nucleotides in a small window around the splice site. We present evidence that computationally predicted secondary structure of moderate-length pre-mRNA subsequences contains information that can be exploited to improve acceptor splice site prediction beyond that possible with conventional sequence-based approaches. Both decision tree and support vector machine classifiers, using folding energy and structure metrics characterizing helix formation near the splice site, achieve a 5-10% reduction in error rate with a human data set. Based on our data, we hypothesize that acceptors preferentially exhibit short helices at the splice site. |
Year | Venue | Keywords |
---|---|---|
2002 | Pacific Symposium on Biocomputing | error rate,secondary structure,decision tree,nucleotides,support vector machine,gene prediction |
Field | DocType | ISSN |
Decision tree,Precursor mRNA,Computer science,Support vector machine,Word error rate,Gene prediction,Helix,RNA splicing,Bioinformatics,Protein secondary structure | Conference | 2335-6936 |
Citations | PageRank | References |
13 | 1.16 | 4 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Donald J. Patterson | 1 | 1765 | 219.99 |
Ken Yasuhara | 2 | 47 | 5.89 |
Walter L. Ruzzo | 3 | 2727 | 550.25 |