Title
Building Highly-Optimized, Low-Latency Pipelines for Genomic Data Analysis.
Abstract
Next-generation sequencing has transformed genomics into a new paradigm of data-intensive computing. The deluge of genomic data needs to undergo deep analysis to mine biological information. Deep analysis pipelines often take days to run, which entails a long cycle for algorithm and method development and hinders future application for clinic use. In this project, we aim to bring big data technology to the genomics domain and innovate in this new domain to revolutionize its data crunching power. Our work includes the development of a deep analysis pipeline, a parallel platform for pipeline execution, and a principled approach to optimizing the pipeline. We also present some initial evaluation results using existing long-running pipelines at the New York Genome Center, as well as a variety of real use cases that we plan to build in the course of this project.
Year
2015
Venue
CIDR
Field
Data mining, Pipeline transport, Use case, Computer science, Genomics, Latency (engineering), Big data, Database, New York Genome Center
DocType
Conference
Citations
5
PageRank
0.51
References
12
Authors
3
Name            Order  Citations  PageRank
Yanlei Diao     1      2234       108.95
Abhishek Roy    2      451        32.21
Toby Bloom      3      5          0.51