Title
Workload characterization for MG-RAST metagenomic data analytics service in the cloud
Abstract
The cost of DNA sequencing has plummeted in recent years. The consequent data deluge has imposed heavy burdens on data analysis applications. For example, MG-RAST, a production open-public metagenome annotation service, has experienced an increasingly large volume of data submissions and has demanded scalable resources to meet its computational needs. To address this problem, we have developed a scalable platform to port MG-RAST workloads into the cloud, where elastic computing resources can be used on demand. To utilize such resources efficiently, however, one must understand the characteristics of the application workloads. In this paper, we characterize the MG-RAST workloads running in the cloud from the perspectives of computation, I/O, and data transfer. Insights from this work will help guide application enhancement, service operation, and resource management for MG-RAST and similar big data applications that demand elastic computing resources.
Year: 2014
DOI: 10.1109/BigData.2014.7004394
Venue: BigData Conference
Keywords: elastic cloud resources, production open-public metagenome annotation service, workload characterization, elastic computing resources, genomics, data analysis, big data applications, mg-rast metagenomic data analytics service, big data analysis, data analytics as a service, big data, cloud computing, bioinformatics, data transfer
Field: Data science, Resource management, Data mining, Data analysis, Workload, Computer science, Utility computing, Analytics, Big data, Database, Cloud computing, Scalability
DocType: Conference
ISSN: 2639-1589
Citations: 4
PageRank: 0.45
References: 7
Authors: 8
Name              Order  Citations  PageRank
Wei Tang          1      152        10.65
Jared Bischof     2      4          0.45
Narayan Desai     3      319        29.73
Kanak Mahadik     4      11         1.97
Wolfgang Gerlach  5      81         7.03
Travis Harrison   6      63         5.58
Andreas Wilke     7      314        23.84
Folker Meyer      8      484        51.83