Abstract | ||
---|---|---|
With the advent of high-throughput DNA sequencing technology, the analysis and management of the increasing amount of biological sequence data has become a bottleneck for scientific progress. For example, MG-RAST, a metagenome annotation system serving a large scientific community worldwide, has experienced sustained, exponential growth in data submissions for several years, and this trend is expected to continue. To address the computational challenges posed by this workload, we developed a new data analysis platform, including a data management system (Shock) for biological sequence data and a workflow management system (AWE) supporting scalable, fault-tolerant task and resource management. Shock and AWE can be used to build a scalable and reproducible data analysis infrastructure for upper-level biological data analysis services. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/BigData.2013.6691723 | BigData Conference |
Keywords | Field | DocType |
metagenomics,workflow,workflow management system,shock,data submissions,genomics,mg-rast,data analysis,scientific progress,upper-level biological data analysis services,biology computing,data analysis platform,biological sequence data,high-throughput dna sequencing technology,awe,cloud computing,dna,bioinformatics,scalable data analysis platform,data management system,metagenome annotation system | Resource management,Data science,Data mining,Biological data,Bottleneck,Computer science,Data management,Workflow management system,Workflow,Scalability,Cloud computing | Conference |
ISSN | Citations | PageRank |
2639-1589 | 16 | 1.06 |
References | Authors | |
12 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wei Tang | 1 | 44 | 2.48 |
Jared Wilkening | 2 | 48 | 3.77 |
Narayan Desai | 3 | 319 | 29.73 |
Wolfgang Gerlach | 4 | 81 | 7.03 |
Andreas Wilke | 5 | 314 | 23.84 |
Folker Meyer | 6 | 484 | 51.83 |