Title
Big Data: Cloud Computing In Genomics Applications
Abstract
Healthcare applications typically require big data management as well as intensive computation. This is especially true of recently developed next-generation sequencing technology, which has increased interest in processing huge amounts of information in a timely fashion. In this paper, we test whether healthcare applications scale well on commercial big data platforms that implement the MapReduce framework. We selected short-read sequence alignment and assembly workloads from genome analysis, and chose Bowtie, Blast and Contrail-bio, which are publicly available applications designed to run on the Hadoop MapReduce framework. To speed up the processes, we compressed the intermediate data using various compression schemes and compared those schemes. The test results are very promising and indicate that a wide range of genomic analysis workflows can be optimized on MapReduce frameworks with great computational efficiency and scalability.
Year
2015
Venue
PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA
Keywords
component, hadoop, mapreduce, genomic workflow, bioinformatics
DocType
Conference
Citations
1
PageRank
0.41
References
2
Authors
2
Name                    Order   Citations   PageRank
Hangu Yeo               1       30          6.74
Catherine H. Crawford   2       28          5.25