Title
Bayesian classifiers for detecting HGT using fixed and variable order Markov models of genomic signatures.
Abstract
Analyses of genomic signatures are gaining attention because they allow studies of species-specific relationships without requiring alignments of homologous sequences. A naïve Bayesian classifier was built to discriminate between different bacterial compositions of short oligomers, also known as DNA words. The classifier has proven successful in identifying foreign genes in Neisseria meningitidis. In this study we extend the classifier approach using either a fixed higher order Markov model (Mk) or a variable length Markov model (VLMk). We propose a simple algorithm to lock a variable length Markov model to a certain number of parameters and show that the use of Markov models greatly increases the flexibility and accuracy of prediction compared with that of a naïve model. We also test the integrity of classifiers in terms of false negatives and give estimates of the minimal sizes of training data. We end the report by proposing a method to reject a false hypothesis of horizontal gene transfer. Software and Supplementary information available at www.cs.chalmers.se/~dalevi/genetic_sign_classifiers/.
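To make the fixed-order case concrete, the following is a minimal sketch of a Bayesian classifier that scores a sequence under one order-k Markov model per candidate genome and picks the class with the highest posterior score. It is an illustration under stated assumptions, not the authors' software: the function names, the additive pseudocount smoothing, the toy training sequences, and the choice to ignore the probability of the first k bases are all hypothetical. Input is assumed to be uppercase DNA over A, C, G, T only.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: sequences contain only these uppercase bases


def train_markov(seqs, k, pseudocount=1.0):
    """Estimate log P(base | previous k bases) with additive smoothing.

    Returns a function log_prob(context, base). The smoothing scheme is an
    assumption of this sketch, not taken from the paper.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        for i in range(k, len(seq)):
            counts[seq[i - k:i]][seq[i]] += 1.0

    def log_prob(context, base):
        row = counts.get(context, {})  # unseen contexts fall back to uniform
        total = sum(row.values()) + pseudocount * len(ALPHABET)
        return math.log((row.get(base, 0.0) + pseudocount) / total)

    return log_prob


def log_likelihood(seq, log_prob, k):
    """Sum of conditional log probabilities; the first k bases are not scored."""
    return sum(log_prob(seq[i - k:i], seq[i]) for i in range(k, len(seq)))


def classify(seq, models, k, log_priors=None):
    """Return (best_label, scores) where score = log prior + log likelihood."""
    log_priors = log_priors or {}
    scores = {
        name: log_likelihood(seq, lp, k) + log_priors.get(name, 0.0)
        for name, lp in models.items()
    }
    return max(scores, key=scores.get), scores


# Toy usage with made-up training data (hypothetical, for illustration only).
K = 2
models = {
    "at_rich": train_markov(["ATATATATATAT", "AATTAATTAATT"], K),
    "gc_rich": train_markov(["GCGCGCGCGCGC", "GGCCGGCCGGCC"], K),
}
label, scores = classify("ATATATAT", models, K)
print(label, scores)
```

In this toy setting a gene whose best-scoring model differs from that of its host genome would be a candidate for horizontal transfer. A variable length model would replace the fixed `k` with contexts of differing length, which this sketch does not attempt.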
Year: 2006
DOI: 10.1093/bioinformatics/btk029
Venue: Bioinformatics
Keywords: bayesian classifier, variable order markov model, neisseria meningitis, markov model, variable length markov model, higher order markov model, dna word, classifier approach, different bacterial composition, genomic signature, supplementary information, certain number, horizontal gene transfer, higher order
Field: Maximum-entropy Markov model, Markov model, Computer science, Software, Variable-order Markov model, Artificial intelligence, SIMPLE algorithm, Classifier (linguistics), Machine learning, Naive bayesian classifier, Bayesian probability
DocType:
Journal:
Volume: 22
Issue: 5
ISSN: 1367-4803
Citations: 4
PageRank: 0.57
References: 4
Authors: 3
Name               Order  Citations  PageRank
Daniel Dalevi      1      131        12.89
Devdatt Dubhashi   2      414        29.91
Malte Hermansson   3      4          0.57