Abstract
---
Next-generation sequencing has transformed genomics into a new paradigm of data-intensive computing. The deluge of genomic data needs to undergo deep analysis to mine biological information. Deep analysis pipelines often take days to run, which entails a long cycle for algorithm and method development and hinders future clinical application. In this project, we aim to bring big data technology to the genomics domain and innovate in this new domain to revolutionize its data-crunching power. Our work includes the development of a deep analysis pipeline, a parallel platform for pipeline execution, and a principled approach to optimizing the pipeline. We also present some initial evaluation results using existing long-running pipelines at the New York Genome Center, as well as a variety of real use cases that we plan to build in the course of this project.
Year | Venue | Field
---|---|---
2015 | CIDR | Data mining, Pipeline transport, Use case, Computer science, Genomics, Latency (engineering), Big data, Database, New York Genome Center

DocType | Citations | PageRank
---|---|---
Conference | 5 | 0.51

References | Authors
---|---
12 | 3
Name | Order | Citations | PageRank |
---|---|---|---
Yanlei Diao | 1 | 2234 | 108.95 |
Abhishek Roy | 2 | 451 | 32.21 |
Toby Bloom | 3 | 5 | 0.51 |