Title
Towards automatic detecting of overlapping genes - clustered BLAST analysis of viral genomes
Abstract
Overlapping genes (encoded on the same DNA locus but in different frames) are thought to be rare and were therefore largely neglected in the past. In a test set of 800 viruses, we found more than 350 potential overlapping open reading frames of 500 bp that generate BLAST hits, indicating a possible biological function. Interestingly, five overlaps of more than 2000 bp were found; the largest may even contain triple overlaps. To perform the vast number of BLAST searches required to test all detected open reading frames, we compared two clustering strategies (BLASTCLUST and k-means) and queried the database with one representative per cluster only. Our results show that this approach achieves a significant speed-up while retaining high-quality results (99% precision compared to single queries) for both clustering methods. Future wet lab experiments are needed to show whether the detected overlapping reading frames are biologically functional.
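As a rough illustration of what "overlapping open reading frames in different frames" means computationally, the following is a minimal sketch, not the authors' pipeline. It assumes ATG start codons, the three standard stop codons, and the forward strand only; the function names `find_orfs` and `overlapping_pairs` and the `min_len` and `min_overlap` thresholds are hypothetical choices made for this example.

```python
# Illustrative sketch only: names and thresholds are assumptions,
# not the implementation used in the paper.
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_len=30):
    """Return (start, end, frame) for ATG..stop ORFs on the forward strand.

    start is inclusive, end is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in the current frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


def overlapping_pairs(orfs, min_overlap=1):
    """Return ORF pairs in different frames sharing >= min_overlap bases."""
    pairs = []
    for a in range(len(orfs)):
        for b in range(a + 1, len(orfs)):
            s1, e1, f1 = orfs[a]
            s2, e2, f2 = orfs[b]
            # Length of the shared interval; non-positive means no overlap.
            overlap = min(e1, e2) - max(s1, s2)
            if f1 != f2 and overlap >= min_overlap:
                pairs.append((orfs[a], orfs[b], overlap))
    return pairs
```

For example, `overlapping_pairs(find_orfs("ATGCATGGCCTAAGGTAA", min_len=9))` reports one pair: a frame-0 ORF spanning the whole sequence and a frame-1 ORF embedded in it. A realistic scan would also cover the reverse strand and use a length cutoff closer to the 500 bp mentioned in the abstract.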
Year: 2010
DOI: 10.1007/978-3-642-12211-8_20
Venue: EvoBIO
Keywords: blast analysis, clustering strategy, open reading frame, dna locus, potential overlapping open reading, clustering method, overlapping gene, viral genomes, different frame, overlapping reading frame, future wet lab experiment, blast hit, k means, gene cluster, clustering
Field: Gene, Computer science, Viral genomes, Open reading frame, Bioinformatics, Cluster analysis, Locus (genetics), Test set, Reading frame
DocType: Conference
Volume: 6023
ISSN: 0302-9743
ISBN: 3-642-12210-8
Citations: 0
PageRank: 0.34
References: 7
Authors: 5
Name               Order  Citations  PageRank
Klaus Neuhaus      1      18         1.72
Daniela Oelke      2      225        13.18
David Fürst        3      0          0.34
Siegfried Scherer  4      0          0.34
Daniel A. Keim     5      7704       1141.60