Title
Finding novel genes in bacterial communities isolated from the environment.
Abstract
Novel sequencing techniques can give access to organisms that are difficult to cultivate using conventional methods. When applied to environmental samples, the data generated has some drawbacks, e.g. short length of assembled contigs, in-frame stop codons and frame shifts. Unfortunately, current gene finders cannot circumvent these difficulties. At the same time, the automated prediction of genes is a prerequisite for the increasing amount of genomic sequences to ensure progress in metagenomics. We introduce a novel gene finding algorithm that incorporates features overcoming the short length of the assembled contigs from environmental data, in-frame stop codons as well as frame shifts contained in bacterial sequences. The results show that by searching for sequence similarities in an environmental sample our algorithm is capable of detecting a high fraction of its gene content, depending on the species composition and the overall size of the sample. The method is valuable for hunting novel unknown genes that may be specific for the habitat where the sample is taken. Finally, we show that our algorithm can even exploit the limited information contained in the short reads generated by 454 technology for the prediction of protein coding genes. The program is freely available upon request.
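The abstract names in-frame stop codons and frame shifts as the obstacles for conventional gene finders. The following is a minimal sketch of why they matter, not the authors' algorithm: it translates a read in all six frames with the standard genetic code and splits each translation at stop codons. The function names (`translate`, `six_frame_segments`) and the toy sequences are assumptions made for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the standard table is adequate for this illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_segments(seq, min_len=1):
    """Yield (strand, frame, peptide) for stop-free stretches in six frames."""
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for peptide in translate(s[frame:]).split("*"):
                if len(peptide) >= min_len:
                    yield strand, frame, peptide


# Hypothetical toy gene fragment and a copy with one base deleted.
intact = "ATGGCTAAAGGTGAA"
shifted = intact[:7] + intact[8:]  # single-base deletion (frame shift)

print(translate(intact))       # frame 0 of the intact fragment
print(translate(shifted))      # frame 0 after the deletion
print(translate(shifted[2:]))  # frame 2 after the deletion
```

In this toy case the residues downstream of the deletion reappear in a different reading frame, so a tool that expects one uninterrupted open reading frame would see only fragments. Comparing translated fragments across frames by sequence similarity is one way such pieces could be linked, which is the general idea the abstract describes.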
Year
2006
DOI
10.1093/bioinformatics/btl247
Venue
ISMB (Supplement of Bioinformatics)
Keywords
short length,environmental sample,bacterial sequence,gene content,frame shift,current gene,novel gene,environmental data,automated prediction,hunting novel unknown gene,bacterial community,genome sequence,gene finding
Field
Sequence alignment,Data mining,Gene,Computer science,Gene prediction,Metagenomics,Coding (social sciences),Contig,Bioinformatics,Stop codon,Environmental data
DocType
Conference
Volume
22
Issue
14
ISSN
1367-4811
Citations
6
PageRank
2.57
References
5
Authors
8
Name, Order, Citations, PageRank
Lutz Krause, 1, 63, 6.65
Naryttza N Diaz, 2, 64, 6.21
Daniela Bartels, 3, 28, 6.60
Robert Edwards, 4, 186, 18.91
Alfred Pühler, 5, 73, 6.19
Forest Rohwer, 6, 109, 12.51
Folker Meyer, 7, 484, 51.83
Jens Stoye, 8, 1341, 134.73