Title
PhyloFlow: A fully customizable and automatic workflow for phylogenetic reconstruction
Abstract
Most phylogeny estimation systems, such as SATe2 or DACTAL, use fixed configurations and tools that make them suitable only for solving specific problems. Outside that scope, individual tools and methods have to be combined by hand to obtain the desired phylogeny estimation. PhyloFlow is a new workflow-based framework that can be extended to a wide range of tasks in phylogenetic analysis. The system is especially intended for building large phylogenies, where most methods either provide no solution at all or require unaffordable computing time. The workflow can scale to different phylogenetic estimation problems, the methods and stages already included are fully customizable, and once the user has set up the system, it runs automatically until the phylogenetic tree is completely estimated. With the current version we have recreated two different phylogenetic systems: DACTAL and a case study for human mitochondrial DNA. The first one shows the capability of our framework to reproduce existing systems, together with the benefits that a parallel system can provide. The second one shows the possibilities of building a real-case workflow to estimate a phylogenetic tree for more than 23000 sequences of human mitochondrial DNA (16569 bp on average), applying biological knowledge to the process. Both workflows have been run sequentially and in parallel on an HTC cluster (HTCondor and DAGMan). PhyloFlow source code, the datasets and the workflow configurations are available upon request from the first author.
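As a rough illustration of the workflow idea described in the abstract, the sketch below runs a set of stages in dependency order and executes independent stages concurrently. It is not PhyloFlow's code: the function `run_workflow`, the stage names (`split`, `subtree_0`, `subtree_1`, `merge`) and the toy "trees" are hypothetical, and a local thread pool stands in for the HTCondor/DAGMan cluster mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+


def run_workflow(stages, deps, max_workers=4):
    """Run each stage once all of its prerequisites have finished.

    stages: dict mapping stage name -> callable taking the results dict
    deps:   dict mapping stage name -> set of prerequisite stage names
    Returns (results by stage name, list of stages in completion order).
    """
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in stages})
    sorter.prepare()  # raises graphlib.CycleError on cyclic dependencies
    results, order = {}, []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Stages whose prerequisites are all done; sorted for a stable order.
            ready = sorted(sorter.get_ready())
            # Stages in the same batch are independent, so run them concurrently.
            values = list(pool.map(lambda name: stages[name](results), ready))
            for name, value in zip(ready, values):
                results[name] = value
                order.append(name)
                sorter.done(name)
    return results, order


def demo():
    """Toy divide-and-merge workflow; the 'trees' are just nested tuples."""
    seqs = ["s1", "s2", "s3", "s4"]  # hypothetical sequence identifiers
    stages = {
        "split": lambda r: [seqs[:2], seqs[2:]],
        "subtree_0": lambda r: tuple(r["split"][0]),
        "subtree_1": lambda r: tuple(r["split"][1]),
        "merge": lambda r: (r["subtree_0"], r["subtree_1"]),
    }
    deps = {
        "subtree_0": {"split"},
        "subtree_1": {"split"},
        "merge": {"subtree_0", "subtree_1"},
    }
    return run_workflow(stages, deps)
```

Calling `demo()` runs `split` first, then the two subtree stages in one concurrent batch, and finally `merge`. In a real setting each callable would wrap an external alignment or tree-estimation tool, which is outside the scope of this sketch.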
Year
2014
DOI
10.1109/BIBM.2014.6999303
Venue
Bioinformatics and Biomedicine
Keywords
DNA, biology computing, evolution (biological), genetics, molecular biophysics, source code (software), DACTAL, HTC cluster, PhyloFlow source code, biological system, human mitochondrial DNA, parallel system, phylogenetic analysis, phylogenetic reconstruction, phylogenetic tree, phylogeny estimation systems, maximum likelihood, model selection, phylogeny estimation, scientific workflow, supertree
Field
Phylogenetic tree, Source code, Computer science, Phylogenetic reconstruction, Theoretical computer science, Bioinformatics, Phylogenetics, Workflow
DocType
Conference
ISSN
2156-1125
Citations
1
PageRank
0.37
References
9
Authors
3
Name                       Order  Citations  PageRank
Jorge Álvarez-Jarreta      1      1          0.37
Gregorio de Miguel Casado  2      24         5.69
Elvira Mayordomo           3      500        39.46