Title
Initial steps towards a production platform for DNA sequence analysis on the grid.
Abstract
BACKGROUND: Bioinformatics is confronted with a new data explosion due to the availability of high throughput DNA sequencers. Data storage and analysis become a problem on local servers, and a switch to other IT infrastructures is therefore needed. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users. RESULTS: In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available for the members of the Dutch Life Science Grid and the VL-e Medical virtual organizations. All components are open source and can be transported to other grid infrastructures. CONCLUSIONS: The availability of in-house expertise and tools facilitates the usage of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to address the capacity and collaboration issues raised by the deployment of next generation sequencers. We currently adopt this methodology on a daily basis for DNA sequencing and other applications. More information and source code are available via http://www.bioinformaticslaboratory.nl/
Year: 2010
DOI: 10.1186/1471-2105-11-598
Venue: BMC Bioinformatics
Keywords: source code, data storage, dna sequence, computational biology, sequence alignment, graphical interface, dna sequence analysis, proof of concept, workflow, data transfer, next generation sequencing, high throughput
Field: Workflow technology, Computer science, Computer data storage, Server, Grid energy storage, Bioinformatics, Throughput, Workflow, Grid, Virtual organization
DocType: Journal
Volume: 11
Issue: 1
ISSN: 1471-2105
Citations: 12
PageRank: 0.42
References: 19
Authors: 6
Name                        Order  Citations  PageRank
A. C. M. Luyf               1      15         1.75
Barbera van Schaik          2      21         2.28
Michel de Vries             3      12         0.42
F Baas                      4      17         1.86
Antoine H. C. van Kampen    5      62         8.90
Sílvia D. Olabarriaga       6      63         6.56