Title
Tree Pruner: An efficient tool for selecting data from a biased genetic database.
Abstract
Large databases of genetic data are often biased in their representation. Thus, selection of genetic data with desired properties, such as evolutionary representation or shared genotypes, is problematic. Selection on the basis of epidemiological variables may not achieve the desired properties. Available automated approaches to the selection of influenza genetic data make a tradeoff between speed and simplicity on the one hand and control over quality and contents of the dataset on the other hand. A poorly chosen dataset may be detrimental to subsequent analyses.
We developed a tool, Tree Pruner, for obtaining a dataset with desired evolutionary properties from a large, biased genetic database. Tree Pruner provides the user with an interactive phylogenetic tree as a means of editing the initial dataset from which the tree was inferred. The tree visualization changes dynamically, using colors and shading, reflecting Tree Pruner actions. At the end of a Tree Pruner session, the editing actions are implemented in the dataset. Currently, Tree Pruner is implemented on the Influenza Research Database (IRD). The data management capabilities of the IRD allow the user to store a pruned dataset for additional pruning or for subsequent analysis. Tree Pruner can be easily adapted for use with other organisms.
Tree Pruner is an efficient, manual tool for selecting a high-quality dataset with desired evolutionary properties from a biased database of genetic sequences. It offers an important alternative to automated approaches to the same goal, by providing the user with a dynamic, visual guide to the ongoing selection process and ultimate control over the contents (and therefore quality) of the dataset.
Year
2011
DOI
10.1186/1471-2105-12-51
Venue
BMC Bioinformatics
Keywords
computational biology, data management, phylogenetic tree, bioinformatics, database management systems, data mining, genetics, algorithms, selection bias, phylogeny, microarrays
Field
Data mining, Computer science, Software, Bioinformatics, Selection bias, Database
DocType
Journal
Volume
12
Issue
1
ISSN
1471-2105
Citations
9
PageRank
0.38
References
4
Authors
6
Name                    Order  Citations  PageRank
Mohan Krishnamoorthy    1      1091       68.89
Pragneshkumar Patel     2      15         1.29
Mira Dimitrijevic       3      11         0.84
Jonathan Dietrich       4      24         1.24
Margaret Green          5      9          0.38
Catherine Macken        6      23         3.17